ABSTRACT

We present the largest homogeneous survey of z > 4.4 damped Lyα systems (DLAs),
using the spectra of 163 QSOs that comprise the Giant Gemini GMOS (GGG) survey. With
this survey we make the most precise high-redshift measurement of the cosmological mass
density of neutral hydrogen, Ω_HI. At such high redshift, important systematic uncertainties
in the identification of DLAs are produced by strong intergalactic medium absorption and QSO
continuum placement. These can cause spurious DLA detections, result in real DLAs being
missed, or bias the inferred DLA column density distribution. We correct for these effects
using a combination of mock and higher-resolution spectra, and show that for the GGG DLA
sample the uncertainties introduced are smaller than the statistical errors on Ω_HI. We find
Ω_HI = 0.98^{+0.20}_{-0.18} × 10^-3 at ⟨z⟩ = 4.9, assuming a 20% contribution from lower
column density systems below the DLA threshold. By comparing to literature measurements at
lower redshifts, we show that Ω_HI can be described by the functional form
Ω_HI(z) ∝ (1 + z)^0.4.

This gradual decrease from z = 5 to 0 is consistent with the bulk of H I gas being a
transitory phase fuelling star formation, which is continually replenished by more highly
ionized gas from the intergalactic medium and from recycled galactic winds.

1 INTRODUCTION

The neutral hydrogen mass density of the universe, Ω_HI, is an important cosmological
observable. It determines the precision with which cosmological parameters can be
constrained by observations of the H I intensity power spectrum (e.g. Barkana & Loeb 2007;
Chang et al. 2008; Wyithe & Loeb 2008; Padmanabhan et al. 2015), and we expect its evolution
to be linked to the cosmic star formation history. The main contributor to Ω_HI is high
column density, predominantly neutral gas clouds (e.g. O'Meara et al. 2007; Zafar et al.
2013), self-shielded from ionizing radiation and therefore likely fuel for future star
formation (e.g. Wolfe et al. 2005). Thus tracing the evolution of Ω_HI from the end of
reionization, through the epoch of the cosmic star formation peak at z ∼ 2, to the present
day is of central importance to our understanding of galaxy formation. It also provides an
excellent integral constraint against which theoretical models of galaxy formation can be
tested.

At redshift < 0.3, H I 21 cm emission can be used to measure Ω_HI either directly or by
stacking analyses (e.g. Zwaan et al. 2005; Martin et al. 2010). At higher redshifts, where
emission is too weak to be detected with current facilities, Ω_HI can instead be inferred
from the incidence rate of damped Lyα systems (DLAs, defined as absorption systems with
N_HI ≥ 10^20.3 cm^-2), which trace the bulk of neutral gas in the universe (Prochaska et al.
2005). These systems are detected in absorption in the spectra of background QSOs, and
their characteristic damping wings allow column densities to be measured even at low
spectral resolution.


Early DLA surveys at 2 < z < 4, which typically comprised a few hundred QSOs and assumed
a cosmological deceleration parameter q_0 = 0.5 or 0, suggested that the gas mass density
in DLAs may have been sufficient to produce most of the stars seen in the local universe
(Lanzetta et al. 1991; Wolfe et al. 1995; Storrie-Lombardi et al. 1996). However, a change
to a modern concordance cosmology revealed that DLAs at z ∼ 3 contain < 50 percent of the
present-day mass density in stars (e.g. Storrie-Lombardi & Wolfe 2000; Peroux et al. 2005;
see also Section 5.2). In addition, recent DLA surveys at 2 < z < 4 using more than 10,000
QSOs assembled from the Sloan Digital Sky Survey (SDSS) (Prochaska & Herbert-Fort 2004;
Prochaska et al. 2005; Prochaska & Wolfe 2009; Noterdaeme et al. 2009, 2012) have shown
that there is very little evolution in the H I mass density from z = 3 to the present day.
This is starkly at odds with the strong evolution in the star formation rate over the same
period (e.g. Madau & Dickinson 2014). One view is that H I represents a transitory phase
fuelling star formation (e.g. Prochaska et al. 2005; Dave et al. 2013), which is
continually replenished by more highly ionized gas from either the intergalactic medium
(IGM) or recycled galactic outflows.


While it is important to constrain Ω_HI across the whole of cosmic history, it is of
particular interest at the highest redshifts. Rafelski et al. (2014) report a decrease in
the metal mass density in damped Lyα systems from z = 5 to 4.5, hinting at an abrupt change
in the enrichment of H I gas past z = 5. This may be caused by a change in the population
of objects containing neutral hydrogen, which could be accompanied by a similarly abrupt
evolution in Ω_HI. Moreover, since massive stars in galaxies are believed to have reionized
the Universe (e.g. Bouwens et al. 2012), it is important to track the evolution of the fuel
for star formation up to the epoch of reionization. However, it is a challenge to assemble
the large sample of high-redshift QSO spectra necessary for a z > 4.5 DLA survey. The
decline in the QSO space density at z > 3 means that relatively few redshift > 4.4 QSOs
were observed by the SDSS, and those that were typically have too low a S/N to reliably
identify DLAs. For example, Rafelski et al. (2012, 2014) find a misidentification rate of
26% for DLA candidates from SDSS DR5 at z > 4, and of 97% for candidates from DR9 at
z > 4.7. For this reason smaller DLA surveys have been performed at higher redshift, often
using higher resolution spectra to make robust identifications of DLAs. Peroux et al.
(2003), Guimaraes et al. (2009) and Songaila & Cowie (2010) have all presented measurements
of Ω_HI at z > 4.5. Songaila & Cowie (2010, hereafter S10) give a cumulative result
including data from all these previous studies, and this represents the highest redshift
measurement of Ω_HI to date. They use a sample of 19 QSOs with emission redshifts > 4.5,
and their measurement hints at a possible downturn in Ω_HI at z ≳ 4, but the uncertainties
from sample variance at z > 4.3 are large.

Here we measure Ω_HI as traced by DLAs at 3.5 < z < 5.4 using a homogeneous sample of 163
QSOs with emission redshifts between 4.4 and 5.4. This represents an increase in redshift
path of a factor of eight over S10 at z > 4.5. Identifying DLAs becomes increasingly
difficult at higher redshift, as H I absorption from the highly ionized intergalactic
medium (IGM) becomes more severe, and blending with strong systems below the DLA threshold
can cause misidentification of DLAs. Therefore we carefully check for systematic
misidentifications in our sample using both mock spectra and higher resolution spectra of
DLA candidates. More than 70% of our DLA candidates (and > 85% at z > 4.5) have been
observed at higher resolution (Rafelski et al. 2012, 2014), allowing us to confirm their
N_HI despite the increased IGM blending at high redshift.

This paper is structured as follows. In Section 2 we describe the QSO spectra used for the
analysis. Section 3 describes the formalism used to derive Ω_HI from our observations and
Section 4 describes our method for measuring the DLA incidence rate, accounting for
systematic effects. Section 5 describes our main result, a measurement of the neutral
hydrogen mass density at z = 5, and discusses its implications. Section 6 summarises our
conclusions. We assume a flat ΛCDM cosmology, with H_0 = 70 km s^-1 Mpc^-1, Ω_m,0 = 0.3
and Ω_Λ,0 = 0.7. All distances are comoving unless stated otherwise. The data and code
used for this paper are available at https://github.com/nhmc/GGG_DLA


2 DATA 

Our main data sample consists of GMOS spectra for the 163 QSOs which comprise the Giant
Gemini GMOS (GGG) survey (Worseck et al. 2014). The QSOs were taken from the SDSS and all
have emission redshifts 4.4 < z < 5.4. At these emission redshifts, the QSO sightlines are
likely unbiased regarding the number density of DLAs, unlike sightlines with
2.7 < z_em < 3.6 (Prochaska et al. 2009; Worseck & Prochaska 2011; Fumagalli et al. 2013).
We also use a smaller sample of 59 QSOs with higher resolution spectra, listed in
Table ??. In contrast to the GGG sample, most of these QSOs were targeted because of a
known DLA candidate towards the QSO. One of these higher resolution spectra was taken with
the Magellan Echellette Spectrograph on the Magellan Clay Telescope (Jorgenson et al. 2013)
and the remainder were taken with the Echellette Spectrograph and Imager on the Keck II
Telescope (Rafelski et al. 2012, 2014). Of these QSOs, 39 are also in the GGG sample, and
the remaining 20 have a similar emission redshift to the GGG QSOs. We use these higher
resolution spectra to assess the reliability of our DLA identifications and to estimate the
importance of systematic effects, but they are not included in the statistical sample used
to measure Ω_HI. Figure 1 shows the QSO emission redshift distribution for our sample and
the redshift path, g(z), where DLAs can be detected, in comparison to previous
high-redshift surveys. We define

$$ g(z) = \sum_i H(z_i^{\rm max} - z)\, H(z - z_i^{\rm min}), \tag{1} $$

where H is the Heaviside step function, and z_i^min and z_i^max are the redshift limits for
detecting DLAs in each QSO spectrum (e.g. Zafar et al. 2013).
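As an illustration of equation (1), the short sketch below counts the sightlines whose search window contains a given redshift. The search windows are made-up values rather than entries from the survey tables, and treating the step function as inclusive at the limits is an assumption, since the value of H(0) is not specified in the text.

```python
def redshift_path(z, z_min, z_max):
    """Return g(z): the number of sightlines whose search window contains z."""
    # H(z_max_i - z) * H(z - z_min_i) equals 1 only inside [z_min_i, z_max_i].
    return sum(1 for lo, hi in zip(z_min, z_max) if lo <= z <= hi)


# Hypothetical search windows for three QSOs (illustrative values only).
z_min = [3.60, 3.85, 4.10]
z_max = [4.40, 4.70, 5.00]

for z in (3.5, 4.0, 4.2, 4.9):
    print(z, redshift_path(z, z_min, z_max))
```

Evaluating this function on a grid of redshifts gives a curve of the kind plotted in the right panel of Figure 1.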

For a detailed description of the GGG spectra and the procedure used to reduce them, see
Worseck et al. (2014). In brief, they were observed with the Gemini Multi Object
Spectrometers on the Gemini telescopes, yielding a typical S/N ∼ 20 per 1.85 Å pixel in the
Lyα forest at a resolution of ∼ 5.5 Å (full width at half maximum, FWHM). The spectral
coverage was tuned to be roughly constant in the quasar rest frame (typically 850-1450 Å).
The high-resolution ESI spectra we use¹ have a typical S/N of 15 per 10 km s^-1 pixel and
a resolution FWHM of 31 km s^-1 (see Table ??). The single MagE spectrum has a similar S/N
but a resolution of 56 km s^-1.


3 FORMALISM 


Our aim is to measure the cosmic H I mass density at 3.5 < z < 5.4. The bulk of the neutral
gas at 2 < z < 5 is in DLAs, with a ∼ 15% contribution from sub-damped Lyα systems (which
have 10^19 < N_HI/(cm^-2) < 10^20.3) and more highly ionized Lyman limit and Lyα forest
absorbers with N_HI < 10^19 cm^-2 (Peroux et al. 2003; Prochaska et al. 2005; O'Meara et
al. 2007; Zafar et al. 2013). There are several ways to express the comoving mass density
of neutral hydrogen used in the literature. For measurements at low redshift using radio
emission, authors typically quote Ω_HI, which is the mass of neutral hydrogen alone,
excluding any mass in molecules and helium. For DLA absorption studies, authors generally
quote the gas mass in DLAs, Ω_g^DLA (sometimes the g subscript is omitted), including a
factor to account for helium. Prochaska et al. (2005) advocate using the quantity Ω_g^neut,
which is the mass in predominantly neutral gas, which can be different from Ω_g^DLA. In
this work we quote the mass density from H I alone, Ω_HI, and exclude any mass contribution
from helium or molecules. Due to contamination and the low resolution of the GMOS spectra,
we only measure H I in DLAs, Ω_HI^DLA. To convert to Ω_HI we apply a correction derived
from measurements of lower N_HI systems in previous work.

We measure Ω_HI^DLA by counting the incidence rate of DLAs in the spectra, and measuring
N_HI from their strong damping wings. Below is a summary of the formalism used to derive
Ω_HI^DLA from the DLA incidence rate. See section 4.1 of Prochaska et al. (2005) and the
review by Wolfe et al. (2005) for a more detailed description.

The number of DLAs in the intervals (N_HI, N_HI + dN_HI) and (X, X + dX) is defined as the
frequency distribution, f_DLA(N_HI, X) dN_HI dX. Here X is the 'absorption distance',
defined such that a non-evolving population has a constant absorption frequency:

$$ dX = \frac{H_0}{H(z)}\,(1 + z)^2\, dz, \tag{2} $$

where H is the Hubble parameter. The DLA incidence rate is then

$$ \ell_{\rm DLA}(X)\,dX = \int_{N_{\rm HI,min}}^{\infty} f_{\rm DLA}(N_{\rm HI}, X)\, dN_{\rm HI}\, dX. \tag{3} $$


¹ The reduced spectra are available at http://www.rafelski.com/data/DLA/hizesi


It is related to the comoving number density of DLAs, n_DLA(X), and the proper absorption
cross section, A(X), by

$$ \ell_{\rm DLA}(X) = \frac{c}{H_0}\, n_{\rm DLA}(X)\, A(X). \tag{4} $$

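Since DLAs are mostly neutral, the H I mass per DLA is m_H N_HI A(X), where m_H is the
hydrogen atom mass. Combining this with equation (4) gives

$$ \begin{aligned}
\Omega_{\rm HI}^{\rm DLA}(X)\,dX
 &= \frac{H_0}{c}\,\frac{m_{\rm H}}{\rho_{\rm crit,0}}
    \int_{N_{\rm HI,min}}^{\infty} N_{\rm HI}\, f_{\rm DLA}(N_{\rm HI}, X)\, dN_{\rm HI}\, dX \\
 &= \frac{8\pi G\, m_{\rm H}}{3 c H_0}
    \int_{N_{\rm HI,min}}^{\infty} N_{\rm HI}\, f_{\rm DLA}(N_{\rm HI}, X)\, dN_{\rm HI}\, dX.
\end{aligned} \tag{5} $$

N_HI,min = 10^20.3 cm^-2, so this expression does not include the contribution from lower
N_HI systems to Ω_HI. We discuss how we include this contribution in Section 3.2.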
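In practice the integral in equation (5) is often approximated by a sum over the detected absorbers divided by the total absorption path ΔX. The sketch below shows that discrete form under stated assumptions: the function name, the input column densities and the path length are hypothetical, the constants are standard cgs values, and this is not necessarily the exact estimator applied to the GGG sample.

```python
import math

# Physical constants in cgs units.
C_CM_S = 2.99792458e10   # speed of light
G_CGS = 6.674e-8         # gravitational constant
M_H_G = 1.6735e-24       # mass of a hydrogen atom
MPC_CM = 3.0857e24       # centimetres per megaparsec

LOG_NHI_MIN = 20.3       # DLA threshold, log10(N_HI / cm^-2)


def omega_hi_dla(log_nhi_values, delta_x, h0_km_s_mpc=70.0):
    """Discrete estimate of Omega_HI^DLA from DLA column densities and path Delta X."""
    h0 = h0_km_s_mpc * 1e5 / MPC_CM                 # H_0 in s^-1
    rho_crit = 3 * h0**2 / (8 * math.pi * G_CGS)    # critical density, g cm^-3
    # Only systems at or above the DLA threshold enter the sum.
    total_nhi = sum(10**x for x in log_nhi_values if x >= LOG_NHI_MIN)
    return h0 * M_H_G / (C_CM_S * rho_crit) * total_nhi / delta_x


# Hypothetical inputs: four absorbers (one below threshold) over Delta X = 40.
example = omega_hi_dla([20.3, 20.6, 21.1, 20.1], delta_x=40.0)
print(f"Omega_HI^DLA = {example:.2e}")
```

Because the sum is weighted by N_HI, a few high column density systems dominate the result, which is why the completeness of the highest N_HI bins matters most.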

Due to the low resolution of the GMOS spectra, confusion from the strong Lyα forest
absorption at z > 4, uncertainty in the continuum level, and systematics affecting sky
subtraction, the measured frequency of DLAs, f_meas(N_HI), may differ from the true f_DLA.
Therefore we introduce a correction factor k(N_HI) such that

$$ f_{\rm DLA}(N_{\rm HI}) = f_{\rm meas}(N_{\rm HI})\, k(N_{\rm HI}). \tag{6} $$

k(N_HI) is the result of at least two effects: some systems flagged as DLAs will actually
be spurious (false positives), and some real DLAs will be missed (false negatives). We
estimate k(N_HI) in the following way. Let N_cand be the number of DLA candidates flagged
in our QSO survey. N_cand,true of these candidates will be real DLAs, and the remainder
will be spurious. If N_true is the true number of DLAs in the spectra, then we can denote
the fraction of DLA candidates which are not spurious as k_real = N_cand,true/N_cand, and
the fraction of true DLAs that are correctly identified as k_found = N_cand,true/N_true.
This gives

$$ f_{\rm DLA}(N_{\rm HI}) = f_{\rm meas}(N_{\rm HI})\, \frac{k_{\rm real}}{k_{\rm found}}, \tag{7} $$

and thus k(N_HI) = k_real/k_found. In the following sections we describe how we measure
f_meas, and how high-resolution and mock spectra are used to estimate k_real and k_found.
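The bookkeeping behind equations (6) and (7) can be sketched in a few lines. The counts used below are hypothetical and serve only to show how the two fractions combine; the function name is likewise illustrative.

```python
def correction_factor(n_cand, n_cand_true, n_true):
    """Return (k_real, k_found, k) from candidate and true DLA counts."""
    k_real = n_cand_true / n_cand     # fraction of candidates that are real
    k_found = n_cand_true / n_true    # fraction of real DLAs that were flagged
    return k_real, k_found, k_real / k_found


# Hypothetical counts: 50 candidates, 40 of them real, 48 real DLAs in total.
k_real, k_found, k = correction_factor(n_cand=50, n_cand_true=40, n_true=48)
print(f"k_real = {k_real:.2f}, k_found = {k_found:.2f}, k = {k:.2f}")
```

Note that the number of confirmed candidates cancels in the ratio, so k reduces to N_true/N_cand: false positives and false negatives partly offset each other.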


3.1 Other systematic effects contributing to k(N_HI)


In measuring k(N_HI) we explicitly take into account the rate of spurious DLAs (false
positives) and missed DLAs (false negatives). There are several other systematic effects
which could also contribute to k(N_HI), which we discuss here.

The first of these is any uncertainty in the N_HI measurements. If there are large
uncertainties in N_HI, or systematic offsets in the N_HI estimated from the spectra as a
function of N_HI, this may change the inferred f(N_HI). However, in Section 4.3 we show
that the N_HI error from the GMOS spectra (0.2 dex) does not have a detectable systematic
bias, and we later show that any errors it introduces to Ω_HI are negligible compared to
other uncertainties. A related effect is for N_HI measurements at the DLA threshold of
N_HI = 10^20.3 cm^-2, where the more numerous lower column density systems may be counted
as DLAs through N_HI uncertainties. This bias is a net source of false positives, and so
should be taken into account by our procedure for estimating k_real.

Figure 1. The left panel shows the emission redshift distribution for QSOs in the
low-resolution GGG sample (open histogram), and for the subsample of these QSOs targeted
with higher resolution spectra. The right panel shows the redshift path, g(z), for
detecting DLAs for the GGG sample. g(z) is defined as the number of QSOs where a DLA can be
detected as a function of DLA redshift. For comparison, g(z) for previous high-redshift DLA
surveys is also shown: for Peroux et al. (2003) and for z > 4.5 QSOs from S10. We do not
show g(z) for the SDSS DLA surveys (e.g. Noterdaeme et al. 2012). Their g(z) formally
extends to z > 4, but Prochaska et al. (2005) warn this high redshift sensitivity should be
viewed conservatively and Noterdaeme et al. (2012) do not include DLAs with z > 3.5 in
their statistical sample.

A second possibility is the presence of dust in DLAs. If DLAs contain large amounts of dust
they are able to extinguish the light from a background QSO, removing these sightlines from
our survey. In this case we would measure a lower incidence of high-metallicity, high-N_HI
DLAs, which presumably contain the most dust. However, several studies have shown that most
DLAs are not associated with significant amounts of dust (e.g. Murphy & Liske 2004; Vladilo
et al. 2008), and DLAs towards radio-selected QSOs, which are insensitive to the presence
of dust, have a similar N_HI distribution to those in optically-selected QSOs (Ellison et
al. 2001; Jorgenson et al. 2006). Pontzen & Pettini (2009) find that the cosmic H I mass
density may be underestimated by 3-23% at z ∼ 3 due to selection biases from dust. We do
not include this relatively small effect in our analysis, but note where its inclusion
would affect our conclusions.

Gravitational lensing may also introduce a bias. DLA host galaxies may lens background
QSOs, making them more likely to be found in our survey. This would result in brighter QSOs
being more likely to show foreground DLA absorption compared to fainter QSOs. At z ∼ 3,
Murphy & Liske (2004) found evidence at the ∼ 2σ level that DLAs tend to be found towards
brighter QSOs. Prochaska et al. (2005) found a higher incidence rate of high N_HI DLAs
towards brighter QSOs compared to fainter QSOs over the redshift range of their survey,
which resulted in a significant (> 95 percent) difference in Ω_HI between the two samples.
They attributed this effect to gravitational lensing. We confirm that this effect is also
present in our sample (which has some overlap with the Prochaska et al. sample): there is
a 25 ± 15 percent higher incidence rate of DLAs towards QSOs with z-band magnitude ≤ 19.2
compared to QSOs with z > 19.2 mag. DLAs towards bright QSOs also tend to have high N_HI,
resulting in a 30 percent increase in Ω_HI for the brighter compared to the fainter QSO
sample. The significance of the excesses we measure is modest (1.7σ), and a
Kolmogorov-Smirnov test between the N_HI distributions towards z ≤ 19.2 and z > 19.2 mag
quasars yields D = 0.3 and a probability of 22% that the two samples are drawn from the
same underlying distribution. Therefore, while this difference hints at a selection effect
related to the background QSO brightness, we cannot yet rule out a simple statistical
fluctuation. We further discuss how this possible bias may affect our Ω_HI measurement in
Section 5.1.
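For readers unfamiliar with the Kolmogorov-Smirnov statistic quoted above, the following sketch computes the two-sample D, the largest vertical distance between two empirical cumulative distributions. The log N_HI values are invented for illustration, and the sketch deliberately stops at D without computing the associated probability.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions, so the maximum difference
    # is reached at one of the data values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical log10(N_HI) values towards bright and faint QSOs.
bright = [20.4, 20.7, 20.9, 21.1, 21.3]
faint = [20.3, 20.4, 20.5, 20.8, 21.0]
print(f"D = {ks_statistic(bright, faint):.2f}")
```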

3.2 Conversion from Ω_HI^DLA to Ω_HI

Previous absorption studies have shown that the dominant contribution to Ω_HI is from DLAs.
Lower column density systems also contribute an appreciable fraction of Ω_HI, however. This
fraction is 15-30% at z = 3, depending on the assumed N_HI distribution (e.g. O'Meara et
al. 2007; Noterdaeme et al. 2009; Prochaska et al. 2010; Zafar et al. 2013). To parametrize
this uncertainty, we introduce a correction factor δ_HI = Ω_HI/Ω_HI^DLA to convert between
Ω_HI^DLA, which we measure, and Ω_HI. We assume the N_HI distribution at z > 4 is not
dramatically different from that at z ∼ 3 and take δ_HI = 1.2, which implies a 20%
contribution from lower column density systems. Zafar et al. find the contribution of
sub-damped systems to Ω_HI increases with redshift, possibly due to a weakening of the UV
background as the number density of QSOs drops at high redshift. Therefore a goal of future
surveys should be to measure the contribution of these sub-damped systems at z > 4.
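As a worked example of this conversion, which only restates the definition of δ_HI and adds no new result, taking δ_HI = 1.2 gives

```latex
\Omega_{\rm HI} = \delta_{\rm HI}\,\Omega_{\rm HI}^{\rm DLA} = 1.2\,\Omega_{\rm HI}^{\rm DLA},
\qquad
\frac{\Omega_{\rm HI} - \Omega_{\rm HI}^{\rm DLA}}{\Omega_{\rm HI}^{\rm DLA}} = 0.2,
\qquad
\frac{\Omega_{\rm HI} - \Omega_{\rm HI}^{\rm DLA}}{\Omega_{\rm HI}} = 1 - \frac{1}{1.2} \approx 0.17 .
```

Read this way, the adopted 20% is expressed relative to the DLA value; as a share of the total Ω_HI the same correction corresponds to about 17%.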

4 METHOD 

4.1 Procedure for identifying DLAs 

We measure the frequency of DLAs, f_meas, by identifying DLA candidates by eye in the GMOS
spectra, and then correcting for any biases in identification using mock spectra. To
identify candidates we performed the following steps for each QSO spectrum:

(i) Estimate the continuum as a spline, placing the spline knot points by hand. We used the
low-z composite QSO spectrum from Shull et al. (2012) to indicate the position of likely
QSO emission lines which fall inside the Lyα forest.

(ii) Look for a possible damped Lyα line in the Lyα forest between the QSO Lyα and Lyβ
emission lines. Estimate its redshift and N_HI by plotting a single component Voigt profile
with b = 30 km s^-1 over the spectrum, and varying N_HI and z until it matches the data by
eye². If necessary the continuum was varied at the same time N_HI was estimated to obtain a
plausible fit. At higher redshifts, blending with IGM absorption can make estimating N_HI
challenging, as the damping wings can be very heavily blended with IGM absorption. In this
case the best constraint on N_HI is not from the shape of the damping wings, but instead
from the extent of the Lyα trough consistent with zero flux, and from any higher-order
Lyman transitions.

(iii) If a candidate DLA is found based on the Lyα profile, use its higher-order Lyman
series (if available in the spectrum) to refine its redshift and N_HI.

(iv) Repeat steps (ii) and (iii) for all DLA candidates in the Lyα forest.
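Step (ii) relies on the fact that the width of the saturated Lyα trough grows with N_HI. The sketch below illustrates this with the Lorentzian (damping wing) limit of the Voigt profile, which is a simplification: it ignores the Doppler core, so it is only meaningful at velocity offsets much larger than b, and it is not the profile-fitting code used for the survey. The atomic constants are standard Lyα values and the function names are hypothetical.

```python
import math

SIGMA_0 = 0.02654              # pi e^2 / (m_e c), in cm^2 Hz
F_LYA = 0.4164                 # Lya oscillator strength
GAMMA_LYA = 6.265e8            # Lya damping constant, s^-1
LAMBDA_LYA_CM = 1215.6701e-8   # Lya rest wavelength, cm
C_KM_S = 299792.458            # speed of light, km/s


def tau_wing(log_nhi, dv_km_s):
    """Damping-wing optical depth at velocity offset dv (valid for |dv| >> b)."""
    nu0 = C_KM_S * 1e5 / LAMBDA_LYA_CM       # line-centre frequency, Hz
    dnu = nu0 * dv_km_s / C_KM_S             # frequency offset, Hz
    cross_section = SIGMA_0 * F_LYA * GAMMA_LYA / (4 * math.pi**2 * dnu**2)
    return 10**log_nhi * cross_section


def trough_half_width(log_nhi, tau_min=3.0):
    """Velocity offset (km/s) inside which the optical depth exceeds tau_min."""
    ref_dv = 1000.0
    # tau scales as 1/dv^2, so rescale from a reference offset.
    return ref_dv * math.sqrt(tau_wing(log_nhi, ref_dv) / tau_min)


for log_nhi in (20.3, 20.8, 21.3):
    print(log_nhi, round(trough_half_width(log_nhi)), "km/s")
```

Since the optical depth in the wings scales as N_HI/Δv², the half-width of the trough scales as the square root of N_HI, so a 0.2 dex change in N_HI changes the trough width by about 26%.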


DLA absorption can also be detected bluewards of the QSO Lyβ emission line. However, we
chose to search only between Lyα and Lyβ emission in our sample to maximise the chance of
having useful Lyman series lines in addition to Lyα, and to avoid any additional systematic
effects caused by further blending with the Lyβ forest. While most DLAs also have
associated metal lines detected by the GMOS spectra, we did not use any metal line
information when measuring the DLA candidate redshift or N_HI. This was done to avoid any
bias against finding low metallicity systems, which may not have detectable metals in the
GMOS spectra.

Two of the authors (NHMC and JXP) searched the spectra for DLAs independently. The above
steps were done either using custom-written Python code, or with X_FITDLA from XIDL,
depending on which author performed the search. For each QSO we also noted any properties
of the spectrum which might complicate the identification of DLAs, such as the presence of
broad absorption lines associated with the background QSO, or of possible problems with the
sky background subtraction. Two example DLA candidates are shown in Figure 2. In these two
cases, higher resolution spectra confirm that both candidates are indeed DLAs. The N_HI and
redshift estimated from the GMOS spectra differ slightly from the values inferred from the
higher resolution spectra; we discuss this issue further in Section 4.3. Once we assembled
a list of DLA candidates, we selected only those within a redshift path limit defined by:


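$$ z_{\rm min} = \frac{\lambda_{\rm Ly\beta}}{\lambda_{\rm Ly\alpha}}\,(1 + z_{\rm qso}) - 1,
\qquad
z_{\rm max} = (1 + z_{\rm qso})\,(1 - \delta v / c) - 1, \tag{8} $$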


where λ_Lyα = 1215.6701 Å, λ_Lyβ = 1025.72 Å and δv = 5000 km s^-1. This δv was chosen to
exclude 'proximate' DLAs, whose incidence rate is likely affected by a combination of
ionizing radiation from the background QSO, and by the overdensity associated with the QSO
host galaxy halo (e.g. Ellison et al. 2002; Russell et al. 2006; Prochaska et al. 2008;
Ellison et al. 2010). Table ?? lists the redshift path limits used for each QSO in the GGG
sample. We then convert the redshift path for each QSO to an absorption distance path using
equation (2).

² This b value was chosen for convenience. The precise b used does not strongly affect the
Lyα profile.
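The redshift limits of equation (8) and the conversion to an absorption distance path can be sketched as follows. Two points are assumptions rather than statements from the text: the lower limit is taken to be the redshift at which DLA Lyα falls on the QSO Lyβ emission line, and the integral of dX/dz is evaluated with Simpson's rule for the flat ΛCDM parameters quoted in the Introduction. The function names are hypothetical.

```python
import math

LAMBDA_LYA = 1215.6701   # Angstrom
LAMBDA_LYB = 1025.72     # Angstrom
C_KM_S = 299792.458      # speed of light, km/s


def redshift_limits(z_qso, dv_km_s=5000.0):
    """Search window between the QSO Lyb emission and dv bluewards of Lya."""
    z_min = LAMBDA_LYB / LAMBDA_LYA * (1 + z_qso) - 1
    z_max = (1 + z_qso) * (1 - dv_km_s / C_KM_S) - 1
    return z_min, z_max


def dx_dz(z, omega_m=0.3, omega_lambda=0.7):
    """dX/dz = (H_0 / H(z)) (1 + z)^2 for a flat LCDM cosmology."""
    return (1 + z)**2 / math.sqrt(omega_m * (1 + z)**3 + omega_lambda)


def absorption_path(z_min, z_max, steps=1000):
    """Integrate dX/dz from z_min to z_max with Simpson's rule (steps even)."""
    h = (z_max - z_min) / steps
    total = dx_dz(z_min) + dx_dz(z_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * dx_dz(z_min + i * h)
    return total * h / 3


z_lo, z_hi = redshift_limits(5.0)
print(f"z_min = {z_lo:.3f}, z_max = {z_hi:.3f}, "
      f"Delta X = {absorption_path(z_lo, z_hi):.2f}")
```

Summing the per-sightline ΔX over all QSOs gives the total absorption path that enters the incidence rate and the mass density estimate.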

With these DLA candidate lists we can derive the measured incidence rate of DLAs, f_meas.
However, despite our attempt to take continuum uncertainties and IGM absorption into
account when measuring N_HI for each DLA, large systematic uncertainties may remain. The
following sections describe how we quantify these uncertainties using the correction
factors k_real and k_found to f_meas.


4.2 Estimation of k_real and k_found

We expect k_real to be less than unity, meaning that there are some spurious DLA
candidates. The rate of these spurious candidates is estimated in two ways. First, we use
the sample of higher-resolution spectra to identify DLAs, and compare these with the DLA
candidates found in the low-resolution sample. Second, we create mock low-resolution
spectra which closely match the GMOS spectra and contain DLAs generated from the N_HI
distribution at z = 3, and then search these spectra for DLAs in the same way as the real
spectra.

k_found is also expected to be less than unity, which means some true DLAs exist which we
do not flag as DLA candidates in the low resolution spectra. Again we estimate the fraction
of true DLAs recovered in two independent ways, using higher resolution spectra and mocks.
In the first case DLAs identified in the higher resolution QSO spectra were used as a
reference list of true DLAs, and compared to the candidate DLAs found in the lower
resolution spectra of the same QSOs. In the second case we used mock GMOS spectra, which
allow us to directly compare known DLAs in the spectra to the DLA candidates.

Our motivation for using two different ways to estimate the correction factors (mocks and
high resolution spectra) is to test different systematic effects. The main advantage of the
mocks is that the true DLA properties are known precisely. However, while we attempt to
reproduce the real spectra as closely as possible, including Lyα forest clustering, QSO
redshift and signal-to-noise distribution, it is still possible that the mocks may differ
from the real GMOS spectra. Metal absorption (not included in the mocks) or clustering of
strong absorbers that is different to the mocks may cause more spurious DLAs.
Alternatively, non-Gaussian noise in the real spectra at low fluxes may mean that true DLAs
are more likely to be missed in the real spectra. Conversely, for the high-resolution
sample the true DLA properties are not known with complete certainty, but the correct
clustering, IGM blending, noise and metal absorption are all included. Therefore these two
approaches provide complementary estimates of k_found and k_real. The following sections
describe these approaches in more detail.


4.2.1 Corrections using high resolution spectra 

Figure 2. DLAs identified in the GMOS spectra (resolution FWHM ∼ 230 km s^-1) which are
confirmed in higher resolution ESI spectra (resolution FWHM ∼ 30 km s^-1). In each case the
top panels show the GMOS spectrum and the bottom panels the ESI spectrum of the same QSO,
plotted against velocity offset Δv (km s^-1). The model shows the N_HI and redshift
estimated from the ESI spectra with the redshift fixed by low-ion metal absorption. The
shaded region shows an uncertainty in log N_HI of 0.2. The N_HI and redshift estimated from
the GMOS spectra are given in Table ??.

N. Crighton et al. © xxxx RAS, MNRAS 000, 1-??

DLAs can be found more easily in our sample of high resolution spectra, and their N_HI and
redshift are more accurately measured, in comparison to the lower resolution GMOS spectra.
Therefore we independently identify DLAs in these spectra for the purpose of deriving the
correction factors k_real and k_found, and to test for any systematics in estimating N_HI
and z for each DLA. When identifying the DLAs in the 59 high-resolution spectra we follow
the same process outlined for the lower-resolution spectra in Section 4.1, using the Lyman
series to estimate the redshift and N_HI. However, we also refine the redshift and N_HI
using the position of low-ionization metal lines (O I, Si II, C II and Al II) where
possible. For the 20 QSOs with high-resolution spectra which are not in the GGG sample, we
created low-resolution spectra by convolving the high-resolution spectra to the same FWHM
resolution, and rebinning to the same pixel size, as the GMOS spectra. The same noise array
was used for these spectra as for the GGG QSO with a redshift closest to each QSO,
normalising such that the median S/N within rest-frame wavelengths 1260-1280 Å match. These
low resolution spectra were searched for DLAs in the same way as the GMOS spectra.
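The degradation of a high-resolution spectrum to GMOS-like resolution described in this section can be sketched as a Gaussian convolution followed by rebinning. This is a minimal illustration under stated assumptions: Gaussian line-spread functions (so that widths add in quadrature), pixels of constant velocity width, rebinning by an integer factor, and a toy input spectrum. The function names are hypothetical and the noise-array step is not reproduced.

```python
import math

FWHM_TO_SIGMA = 1 / (2 * math.sqrt(2 * math.log(2)))


def kernel_fwhm(fwhm_in, fwhm_out):
    """FWHM of the Gaussian kernel that broadens fwhm_in to fwhm_out."""
    return math.sqrt(fwhm_out**2 - fwhm_in**2)


def gaussian_smooth(flux, sigma_pix):
    """Convolve flux with a Gaussian, renormalising the kernel at the edges."""
    half = int(math.ceil(4 * sigma_pix))
    offsets = range(-half, half + 1)
    weights = [math.exp(-0.5 * (i / sigma_pix)**2) for i in offsets]
    smoothed = []
    for j in range(len(flux)):
        num = den = 0.0
        for w, i in zip(weights, offsets):
            if 0 <= j + i < len(flux):
                num += w * flux[j + i]
                den += w
        smoothed.append(num / den)
    return smoothed


def rebin(flux, factor):
    """Average consecutive groups of `factor` pixels."""
    return [sum(flux[i:i + factor]) / factor
            for i in range(0, len(flux) - factor + 1, factor)]


# Toy spectrum on 10 km/s pixels: unit continuum with a narrow saturated line.
flux = [1.0] * 200
flux[98:102] = [0.0] * 4
sigma_pix = kernel_fwhm(31.0, 230.0) * FWHM_TO_SIGMA / 10.0
low_res = rebin(gaussian_smooth(flux, sigma_pix), 8)
print(f"kernel sigma = {sigma_pix:.2f} pixels, minimum flux = {min(low_res):.2f}")
```

In this toy case the narrow saturated line no longer reaches zero flux after smoothing, which illustrates how narrow features in the core of a line are blurred at GMOS resolution.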

In this way we made two lists of DLAs, one from the high resolution spectra, and another
from low-resolution spectra of the same QSOs. The DLAs identified in the higher resolution
sample are listed in columns 5 and 6 of Table ??. We then estimated k_real as
N_cand,true/N_cand, where N_cand is the number of DLA candidates from the low-resolution
spectra, and N_cand,true is the number of those candidates confirmed to be DLAs by the high
resolution spectra. k_found is estimated as N_cand,true/N_true, where N_true is the number
of DLAs found in the high-resolution spectra and N_cand,true is the number of those also
flagged as DLA candidates in the low resolution spectra. We calculate the binomial
confidence intervals on k_real and k_found using the method described by Cameron (2011).
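A binomial confidence interval of the kind referred to above can be obtained from the quantiles of a beta distribution. The sketch below assumes an equal-tailed interval with a uniform prior, so that the posterior for the success fraction is Beta(k + 1, n − k + 1); whether this matches every detail of the cited method is an assumption. It uses only the standard library, with a simple midpoint-rule integration and bisection in place of a library quantile function, and the counts are hypothetical.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Cumulative beta distribution by midpoint-rule integration (a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h   # midpoints never touch 0 or 1
        total += math.exp(log_norm + (a - 1) * math.log(t)
                          + (b - 1) * math.log(1 - t))
    return total * h


def beta_quantile(q, a, b):
    """Invert beta_cdf by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def binomial_interval(k, n, conf=0.683):
    """Equal-tailed interval on a fraction k/n, uniform prior assumed."""
    a, b = k + 1, n - k + 1
    tail = 0.5 * (1 - conf)
    return beta_quantile(tail, a, b), beta_quantile(1 - tail, a, b)


# Hypothetical counts: 40 of 50 candidates confirmed.
low, high = binomial_interval(40, 50)
print(f"k_real = {40 / 50:.2f} (+{high - 40 / 50:.2f} / -{40 / 50 - low:.2f})")
```

Unlike the normal approximation, this interval stays within [0, 1] and remains well defined when k = 0 or k = n, which matters for sparsely populated bins.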

With this procedure we find k_real = 0.80 and k_found = 0.84 using DLAs identified by JXP
(see Figures 5 and 6), with similar values found by NHMC. Both are below unity, and so
there are both spurious DLA candidates and real DLAs missed. Spurious DLAs usually occur
when flux spikes are smoothed away at GMOS resolution, making a lower N_HI system appear to
have strong damping wings. An example spurious DLA is shown in Figure 3. Real DLAs are
generally missed due to flux fluctuations in the core of the Lyα line: an example is shown
in Figure 4.


4.2.2 Corrections using mock spectra 

Our method for generating mock spectra is described in Appendix A. In this case the N_HI
for each DLA is known, and so can be directly compared to the candidates identified in the
low-resolution mocks. Again k_real is estimated as N_cand,true/N_cand, where N_cand is the
number of DLA candidates from the low-resolution mock spectra, and N_cand,true is the
number of those candidates that are DLAs. k_found is estimated as N_cand,true/N_true, where
N_true is the true number of DLAs in the mocks and N_cand,true is the number of those
recovered as DLA candidates. Again we calculate the errors on k_real and k_found assuming a
binomial confidence interval. For the mocks we find k_real = 0.71 ± 0.06 and k_found = 0.92
using DLAs identified by JXP (see Figures 5 and 6). Similar values are found by NHMC (see
Figures A2 and A3).

Figure 3. Example of a spurious DLA candidate. This was identified as a DLA with
N_HI = 10^(20.4 ± 0.2) cm^-2 in the GMOS spectrum shown in the top panels. However, the
residual flux spikes at Lyα and Lyγ in the higher resolution (FWHM ∼ 30 km s^-1) ESI
spectrum in the bottom panels show that this system must have a lower N_HI than the GMOS
estimate.

Figure 4. A DLA that was not correctly identified in the GMOS spectra. Lower panels show
the DLA in the ESI spectrum. The residual flux in the core of the Lyα line in the GMOS
spectrum, however (top left panel), meant this system was missed. This residual flux around
velocities Δv ∼ 0 km s^-1 may be caused by either statistical fluctuations or systematics
associated with the sky background level.


4.2.3 Comparison of correction factors and their dependence on 
redshift and column density 

We expect k_real and k_found to be a function of a DLA's N_HI (high N_HI candidates should
be more reliable), spectral S/N (low S/N spectra will produce more spurious candidates) and
redshift (more spurious DLAs will be found at high redshift where there is more IGM
absorption). The most important of these for our measurement of Ω_HI is any redshift or
N_HI dependence. Noterdaeme et al. (2009) and Noterdaeme et al. (2012) show that at
z ∼ 2.5, systems with N_HI ∼ 10^21 cm^-2 make the largest contribution to Ω_HI. Thus we
expect completeness corrections in this column density range to have the largest effect on
the final derived Ω_HI.³

Footnote: Due to our relatively small DLA sample, we may be missing some
very high N_HI systems with N_HI ≳ 10^[...] cm^-2. These contribute only
10% of Ω_HI at z ~ 3 (Noterdaeme et al. 2012) and thus we do not expect
their absence from our sample to strongly bias our results.

Figure 5. The fraction of true DLAs that were correctly identified by one
of the authors (JXP), k_found, as a function of the true redshift and N_HI.
Top panels are for the high-resolution sample, bottom panels are for mocks.
The upper row of numbers under each histogram gives the number of DLA
candidates that are correct per bin, the lower row the total number of
candidates. The total numbers for all bins are given at the top in the
left panels. Vertical lines show the binomial 68% uncertainties.

Figure 6. The fraction of non-spurious DLA candidates, k_real, by one of
the authors (JXP), as a function of the candidate redshift and N_HI. Top
panels are for the high-resolution sample, bottom panels are for mocks.
The upper row of numbers under each histogram gives the number of true
DLAs that are recovered per bin, the lower row the total number of true
DLAs. The total numbers for all bins are given at the top of the left
panels. Vertical lines show the binomial 68% uncertainties.

The top panels of Figure 5 show the correction factor k_found from the
high-resolution spectra binned by the true DLA redshift and N_HI, and the
bottom panels show the same correction factor estimated from the mocks.
Figure 6 shows the correction factor k_real binned by the candidate DLA
redshift and N_HI, again for the high-resolution spectra and mocks. These
are derived from DLAs identified by one of the authors (JXP) who searched
the spectra for DLAs, but values for the other author (NHMC) are similar.
There is no evidence for a strong dependence of k_real or k_found on
redshift, using either the high-resolution spectra or the mocks. However,
there is a weak dependence of k_real and k_found on N_HI, with the lowest
N_HI bin having a significantly lower k_real than the higher N_HI bins.
This matches our expectations: weaker candidate DLAs are more likely to
be spurious, and true DLAs that are weak are more likely to be missed. We
take this N_HI dependence into account when applying the correction
factors as described in Section 5.1. We find no strong dependence of the
correction factors on S/N in either the mocks or the high-resolution
sample for the range of S/N the GMOS spectra cover.

Figures 5 and 6 also show that corrections derived from the
mocks and high-resolution spectra are in reasonable agreement.
The main difference is in the number of spurious systems with
N_HI ~ 10^20.3-20.6 cm^-2. The right panels of Figure 6 show
that there are more weak, spurious DLAs found in the mocks
compared to the real GMOS spectra. However, we show in the
following section that the correction factor in this N_HI range is not
important for estimating Ω_HI, and for the remaining bins the mocks
and high-resolution corrections match to within 20 per cent. As we
discussed earlier, the high-resolution sample and mocks test
different systematic uncertainties which may affect Ω_HI. Therefore the
consistency of the correction factors between these two methods
suggests the mocks reproduce the true GMOS spectra well, and
that DLAs have been identified correctly in the higher-resolution
spectra.


4.3 Uncertainties in N_HI and redshift

If DLA column densities estimated from the GMOS spectra are
systematically in error, our measurement of Ω_HI may be biased.
Such a systematic could occur because of incorrect placement of
the continuum, or blending of damping wings with the Lyα forest.
This is an additional effect not accounted for by the correction
factor, k, to f_meas. Therefore, we search for any systematic offset in
N_HI by matching DLA candidates from the low-resolution spectra
to known DLAs in the high-resolution sample and mocks.

The results of this test are shown in Figure 7. The log N_HI
difference is plotted as a function of redshift, the true N_HI, and
S/N for the high-resolution sample (top panels) and mocks (bottom
panels). For both the mocks and high-resolution samples, both
Δlog N_HI and Δv are centred on 0. The standard deviations of the
velocity and log N_HI offsets are 184/216 km s^-1 and 0.165/0.196
for the high-resolution sample and mocks, respectively. We therefore
adopt 0.2 dex as our uncertainty in N_HI. There is no trend seen
with redshift or S/N. There may be a trend with N_HI, but above
N_HI = [...] cm^-2 it is too weak to significantly affect Ω_HI. We
conclude that there is no systematic bias in N_HI which might
adversely affect the Ω_HI measurement.

Figure 7. The difference between N_HI estimated for DLA candidates in
low-resolution spectra (N_HI,cand) and N_HI,true measured from
high-resolution spectra (top) or known from mock linelists (bottom), and
similarly for the velocity offset from the true DLA redshift. This is for
one of the authors (JXP), but the results for NHMC are similar. These show
there is no strong systematic offset in the estimated N_HI as a function
of redshift, S/N or N_HI which might systematically bias Ω_HI
significantly. Grey shading shows regions that cannot be populated due to
the requirement that both N_HI,cand and N_HI,true are > [...] cm^-2.

Figure 7 also shows the redshift difference between matched
DLAs expressed as a velocity difference. DLAs identified in the
higher-resolution spectra use low-ionization metal lines to set a
precise DLA redshift with an error of a few km s^-1. Both the mocks and
the high-resolution sample show that an uncertainty of ~200 km s^-1
results from estimating redshifts using Lyman series absorption
alone (without reference to metal absorption) in the low-resolution
spectra.
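
The comparison of matched DLAs described above can be sketched in a few lines. The conversion of a redshift difference into a velocity, Δv = c Δz / (1 + z), is the usual low-velocity approximation and is an assumption of this sketch (the text does not state the formula used); the matched pairs are hypothetical values for illustration only.

```python
import statistics

C_KMS = 299792.458  # speed of light in km/s

def velocity_offset(z_cand, z_true):
    """Velocity offset (km/s) of a candidate redshift from the true redshift."""
    return C_KMS * (z_cand - z_true) / (1.0 + z_true)

# Hypothetical matched pairs: (z_cand, z_true, logN_cand, logN_true).
matches = [
    (4.503, 4.500, 20.55, 20.40),
    (4.698, 4.702, 20.90, 21.05),
    (4.251, 4.249, 21.10, 21.00),
    (4.896, 4.901, 20.45, 20.70),
]

dv = [velocity_offset(zc, zt) for zc, zt, _, _ in matches]
dlogn = [nc - nt for _, _, nc, nt in matches]

# A mean near zero indicates no systematic offset; the scatter gives the
# typical per-DLA uncertainty.
print("dv:    mean %.0f km/s, scatter %.0f km/s"
      % (statistics.mean(dv), statistics.stdev(dv)))
print("dlogN: mean %.2f dex, scatter %.2f dex"
      % (statistics.mean(dlogn), statistics.stdev(dlogn)))
```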


4.4 DLA incidence rate and differential N_HI distribution

Figure 8. The column density distribution, f(N_HI), for the GGG sample
of DLAs. Black points show the measurements with the correction k(N_HI)
applied. Both are slightly offset in N_HI for clarity. Red points show the
SDSS DR5 measurements from Prochaska & Wolfe (2009) after applying
the correction to the redshift search path recommended by Noterdaeme et
al. (2009). The errors are 1σ, and include statistical and systematic
errors (see Section 4.4 for more details).

Figure 8 shows the differential N_HI distribution from the GGG
sample compared to that from the SDSS sample from Prochaska & Wolfe
(2009), which is consistent with the more recent estimate from
Noterdaeme et al. (2012). We have four different measurements
of the correction factor k(N_HI), from two different authors using
the mocks and high-resolution spectra, so there are four different
estimates of f(N_HI, X). We find the final f(N_HI, X) by averaging
these four estimates. The uncertainties on this value include a
statistical and a systematic component. The statistical uncertainty is
found by bootstrap resampling, using 1000 samples from the observed
DLA distribution, and averaging these uncertainties for the four
different estimates. The systematic uncertainty is then assumed to
be the standard deviation of the four estimates. These systematic
and statistical components are added in quadrature to give the
errors shown in Figure 8. The two distributions are similar overall,
although there is a clear discrepancy between the GGG and z = 3
f(N_HI, X) for the bin at log N_HI ~ 21.2, which hints at evolution
in the shape of f(N_HI, X) at high redshift. However, a simple
change in the normalization is also consistent with the data.
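
The error budget described above (bootstrap for the statistical part, scatter between the four estimates for the systematic part, then a quadrature sum) can be sketched as below. This is an illustration under stated assumptions: the column densities, the absorption path and the four correction factors are hypothetical, and the statistic is a simplified stand-in (corrected number of DLAs in one log N_HI bin per unit X) rather than the full f(N_HI, X).

```python
import math
import random
import statistics

def bootstrap_error(sample, statistic, n_boot=1000, seed=0):
    """Standard deviation of `statistic` over bootstrap resamples of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    values = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(resample))
    return statistics.pstdev(values)

# Hypothetical inputs, for illustration only.
log_nhi = [20.35, 20.40, 20.50, 20.55, 20.70, 20.80, 20.95, 21.00, 21.10, 21.30]
delta_x = 100.0              # absorption path
bin_lo, bin_hi = 20.3, 20.9  # one log N_HI bin
k_estimates = {              # correction factor for this bin, per method
    ("JXP", "high-res"): 0.80, ("JXP", "mocks"): 0.70,
    ("NHMC", "high-res"): 0.85, ("NHMC", "mocks"): 0.75,
}

def corrected_rate(k):
    """Build the statistic: k-weighted number of DLAs in the bin per unit X."""
    def statistic(sample):
        return k * sum(bin_lo <= x < bin_hi for x in sample) / delta_x
    return statistic

estimates = [corrected_rate(k)(log_nhi) for k in k_estimates.values()]
stat_errs = [bootstrap_error(log_nhi, corrected_rate(k))
             for k in k_estimates.values()]

value = statistics.mean(estimates)   # average of the four estimates
stat = statistics.mean(stat_errs)    # averaged bootstrap uncertainty
syst = statistics.stdev(estimates)   # scatter between the four estimates
total = math.hypot(stat, syst)       # components added in quadrature
print("value = %.4f, stat = %.4f, syst = %.4f, total = %.4f"
      % (value, stat, syst, total))
```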

The DLA incidence rate, ℓ(X), is shown in Figure 9. This
observable is more sensitive to the lowest N_HI DLAs than Ω_HI. Since
the correction factors we derive are strongest for low N_HI DLAs
and these DLAs have a strong effect on ℓ(X), we expect ℓ(X) to
be sensitive to the particular choices of correction factors. This is
indeed the case: there are systematic differences at least as large
as the statistical errors, and they depend on whether the mocks or
the high-resolution spectra are used to estimate the correction
factor. Similarly large differences are found between ℓ(X) by each of
the two authors who searched for DLAs. The ℓ(X) values we
measure are consistent with a smooth increase from z = 2 to z = 5.
However, since we do not know which k(N_HI) correction factors
are best, we do not attempt to present a definitive ℓ(X)
measurement here. A large sample of higher-resolution spectra, where low
column density DLAs can be identified with more certainty, will be
necessary to robustly measure ℓ(X) at z > 4.

We can still make a more robust measurement of Ω_HI, however,
regardless of the uncertainty in ℓ(X), as Figure 10 illustrates.
DLAs with the largest contribution to Ω_HI have N_HI in the range
10^[...]-10^21.6 cm^-2, and DLAs with lower N_HI make a substantially
smaller contribution. Therefore, while systematic effects may
give rise to a large uncertainty in the number of low column
density systems (and thus ℓ(X)), Ω_HI can still be measured accurately.
This point is discussed further in Section 5.1.

Figure 9. The DLA incidence rate ℓ(X). This observable is more sensitive
to the lowest N_HI DLAs than Ω_HI. Grey points show the uncorrected GGG
measurement, and black squares the measurement with the corrections
applied. Each panel shows a different correction, using either mock or
high-resolution spectra for two different authors. There are systematic
differences comparable to the statistical errors, and they depend on
whether the mocks or the high-resolution spectra are used to estimate
correction factors. Similar differences are also found between the two
different authors who searched for DLAs. These illustrate that
significant systematic uncertainties affect the measurement of ℓ(X).

5 RESULTS AND DISCUSSION 
5.1 Ω_HI measurement

We can now use the N_HI-dependent correction factor k estimated
in the previous section to find f_DLA and thus Ω_HI. For the
GGG sample we count the number of DLAs in a given absorption
path, giving each DLA a weight k(N_HI), where k(N_HI) =
k_real(N_HI)/k_found(N_HI). This follows from the definitions in
Section 4.2: N_cand candidates contain N_cand k_real real DLAs, which
are a fraction k_found of all true DLAs. k(N_HI) is then estimated as
the ratio of the log N_HI histograms shown in Figures 5 and 6, with the
uncertainty on each bin given by the uncertainties in k_found and k_real
added in quadrature.
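
As a worked illustration of the weighting, recall that k_real = N_cand,true/N_cand and k_found = N_cand,true/N_true, so the number of true DLAs implied by N_cand candidates is N_cand × k_real / k_found; the sketch below takes the per-DLA weight in that sense. Two assumptions are made and should be read as such: the quadrature sum is applied to the fractional uncertainties (the usual propagation for a ratio), and the symmetric 0.07 uncertainty on k_found is a placeholder, since only the k_real uncertainty is legible in the text.

```python
import math

def correction_weight(k_real, err_real, k_found, err_found):
    """Per-DLA weight k = k_real / k_found and its propagated uncertainty.

    Fractional uncertainties are added in quadrature (an assumption of
    this sketch about how the quadrature sum is applied).
    """
    k = k_real / k_found
    rel = math.hypot(err_real / k_real, err_found / k_found)
    return k, k * rel

# k_real = 0.71 +/- 0.06 and k_found = 0.92 are the global mock values
# quoted in Section 4.2.2; the 0.07 error on k_found is a placeholder.
k, k_err = correction_weight(0.71, 0.06, 0.92, 0.07)
print("k = %.2f +/- %.2f" % (k, k_err))

# Applying the weight to a hypothetical count of 30 candidates:
n_candidates = 30
print("corrected number of DLAs: %.1f" % (k * n_candidates))
```

In practice the weight would be evaluated per log N_HI bin rather than globally, since the text notes that both factors depend on N_HI.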

Figure 10. The differential Ω_HI distribution, dΩ_HI/(dX dlog N_HI).
DLAs with the largest contribution to Ω_HI have N_HI in the range
10^[...]-10^21.6 cm^-2; DLAs with lower N_HI are less important.
Therefore, while there is uncertainty in the number of low column density
systems (and thus ℓ(X)), Ω_HI can still be measured accurately. This is
also illustrated by Figure 11.

There are two main contributions to the final error on Ω_HI.
The dominant contribution is the statistical error due to the finite
sampling of DLAs: there are 25-30 DLA candidates in each redshift
bin, depending on whether NHMC's or JXP's results are used.
We estimate this error using 1000 bootstrap samples from the DLA
sample. The second is the systematic uncertainty in the correction
factor, k(N_HI). We estimate the effect of this uncertainty using a
Monte Carlo technique. Ω_HI is calculated 1000 times, each time
drawing k(N_HI) from a normal distribution with a mean given by
the k(N_HI) histogram bin value and σ determined by the
uncertainty on that bin, assuming no correlation between uncertainties
in adjacent bins. Then the final error in Ω_HI is given by adding
these two uncertainties in quadrature. We confirmed that the N_HI error
of each DLA (0.2 dex, see Section 4.3) has a negligible contribution
compared to these statistical and systematic uncertainties. We
also checked that using N_HI measurements from the high-resolution
spectra, where available, does not significantly change Ω_HI.
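
The Monte Carlo step can be sketched as follows. The quantity computed is the k-weighted sum of column densities, to which Ω_HI is proportional for a fixed absorption path; the constant of proportionality is omitted because only the fractional scatter matters here. All inputs (column densities, bin edges, k values and their uncertainties, and the 20% statistical error) are hypothetical.

```python
import random
import statistics

def weighted_column_density(log_nhi, bin_edges, k_bins):
    """Sum of k(N_HI) * N_HI over the DLA sample, with k taken per bin."""
    total = 0.0
    for x in log_nhi:
        for lo, hi, k in zip(bin_edges[:-1], bin_edges[1:], k_bins):
            if lo <= x < hi:
                total += k * 10.0 ** x
                break
    return total

def monte_carlo_scatter(log_nhi, bin_edges, k_mean, k_sigma,
                        n_draws=1000, seed=1):
    """Scatter of the weighted sum when each bin's k is drawn independently."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Independent normal draw per bin: no correlation between bins.
        k_draw = [rng.gauss(m, s) for m, s in zip(k_mean, k_sigma)]
        draws.append(weighted_column_density(log_nhi, bin_edges, k_draw))
    return statistics.pstdev(draws)

# Hypothetical inputs, for illustration only.
log_nhi = [20.35, 20.50, 20.70, 20.95, 21.10, 21.30]
bin_edges = [20.3, 20.6, 20.9, 21.2, 21.5]
k_mean = [0.60, 0.85, 0.95, 1.00]
k_sigma = [0.10, 0.08, 0.06, 0.05]

nominal = weighted_column_density(log_nhi, bin_edges, k_mean)
syst_fraction = monte_carlo_scatter(log_nhi, bin_edges, k_mean, k_sigma) / nominal
stat_fraction = 0.20  # hypothetical fractional bootstrap error
total_fraction = (stat_fraction ** 2 + syst_fraction ** 2) ** 0.5
print("systematic: %.1f%%, total: %.1f%%"
      % (100 * syst_fraction, 100 * total_fraction))
```

Because the highest column density systems dominate the sum, the uncertainty on k in the low N_HI bins has little effect on the result, which mirrors the argument made in the text.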

Since we have separate estimates of k(N_HI) from the mocks
and high-resolution sample, and two authors performed these
estimates, we can make four different measurements of Ω_HI. We use
these to gauge the effect on Ω_HI of estimating corrections from the
mocks versus the high-resolution sample, or of any differences in
the way the two authors identified DLAs. The results are shown
in Figure 11. The differences between the mocks and the
high-resolution sample, and between the two authors, are significantly
smaller than the uncertainty on any individual Ω_HI measurement.
Therefore we conclude that neither the methods we use to
estimate k(N_HI), nor any differences in DLA detection between
methods, contribute a significant uncertainty to the final Ω_HI. We
caution that this conclusion only holds for the sample of spectra we
analyse. New tests of systematic effects may be required for
measurements of Ω_HI using larger samples of DLAs, or using QSO
spectra of different resolution or S/N.

For the remainder of the paper we use the measurement of Ω_HI
derived using k from the higher-resolution sample and measured
by author JXP, which is shown in the top-right panel of Figure 11.
This measurement and the 68% confidence interval are given in
Table 1. We assume a 20% contribution to Ω_HI from systems below
the DLA threshold, as described in Section [...].


Figure 11. Ω_HI measured by the two authors using the high-resolution
sample (top) and mocks (bottom). The uncertainties on Ω_HI introduced by
any differences in selecting DLA candidates between the authors, or
between using mocks or the high-resolution sample, are much smaller than
the errors shown, which are a combination of the statistical error and
uncertainty in the correction factor (see Section 5.1).


z            10^3 Ω_HI    10^3 Ω_HI (1σ)    ΔX
3.56-4.45    1.18         0.92-1.44         356.9
4.45-5.31    0.98         0.80-1.18         194.6

Table 1. Ω_HI for the GGG sample, assuming a flat cosmology with H_0 =
70 km s^-1 Mpc^-1 and Ω_m,0 = 0.3. The redshift bins were chosen to
cover roughly equal redshift widths, and to yield approximately equal
numbers of DLAs in each bin. To convert between Ω_HI and Ω_g^DLA, which is
often quoted by other DLA studies, use Ω_HI = δ_HI Ω_g^DLA / μ, where
μ = 1.3 accounts for the mass of helium and δ_HI = 1.2 estimates the
contribution from systems below the DLA threshold of 10^20.3 cm^-2.
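
The conversion quoted in the caption of Table 1 amounts to a single scaling, Ω_HI = δ_HI Ω_g^DLA / μ. A minimal sketch, using the μ and δ_HI values given in the caption (the function names are hypothetical):

```python
MU = 1.3        # accounts for the mass of helium
DELTA_HI = 1.2  # contribution from systems below the DLA threshold

def omega_hi_from_gas(omega_g_dla):
    """Convert the DLA gas density Omega_g^DLA to Omega_HI."""
    return DELTA_HI * omega_g_dla / MU

def omega_gas_from_hi(omega_hi):
    """Inverse conversion, from Omega_HI back to Omega_g^DLA."""
    return MU * omega_hi / DELTA_HI

# The higher-redshift GGG bin from Table 1, expressed as Omega_g^DLA.
omega_hi_ggg = 0.98e-3
print("Omega_g^DLA = %.3e" % omega_gas_from_hi(omega_hi_ggg))
```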


5.1.1 Is there a bias from gravitational lensing? 

There is a 30 ± 20% increase in Ω_HI for sightlines towards the
brighter half of our QSO sample (z ≤ 19.2 mag) relative to Ω_HI
towards the fainter QSOs (z > 19.2 mag). If this effect is caused
by gravitational lensing of a background QSO by a galaxy
associated with a foreground DLA, then our measured Ω_HI will be
artificially enhanced. A detailed lensing analysis is beyond the scope of
this work. However, if we follow Ménard & Fukugita (2012) and
assume the lensing DLA galaxies are isothermal spheres, we can
estimate their Einstein radius as

    \xi_0 = 4\pi \left(\frac{\sigma_v}{c}\right)^2 \frac{D_l D_{ls}}{D_s}    (9)

where σ_v is the velocity dispersion, c is the speed of light and
D_l, D_s and D_ls are the angular diameter distances from the observer
to the lens, from the observer to the source, and from the lens to the
source. Assuming a typical dispersion of 100 km s^-1 we find the
effective radius for lensing is very small, 0.1 kpc for a z = 4.5 DLA
towards a z = 5 QSO. This is half the radius for a DLA at z = 2.5
towards a QSO at z = 3.5. Since the magnitude of the increase in Ω_HI
due to the putative lensing at z ~ 3 is relatively small (~20 ± 10 per
cent, Prochaska et al. 2005) we do not expect it to have a large effect at
higher redshifts. We conclude that it is more likely the difference
in Ω_HI between the bright and faint QSO samples is caused by a
statistical fluctuation, rather than a lensing bias.
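
Equation (9) can be evaluated numerically as in the sketch below. It assumes the flat cosmology adopted elsewhere in the paper (H_0 = 70 km s^-1 Mpc^-1, Ω_m = 0.3, Ω_Λ = 0.7) and the standard flat-universe relations for angular diameter distances; the function names and the Simpson integration are choices made for this illustration, not taken from the paper.

```python
import math

C_KMS = 299792.458   # speed of light in km/s
H0 = 70.0            # Hubble constant in km/s/Mpc
OMEGA_M = 0.3
OMEGA_L = 0.7

def comoving_distance(z, n=2000):
    """Line-of-sight comoving distance in Mpc (flat LCDM, Simpson's rule)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)
    h = z / n  # n must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return (C_KMS / H0) * total * h / 3.0

def einstein_radius_kpc(z_lens, z_source, sigma_v=100.0):
    """Einstein radius xi_0 (kpc) of an isothermal sphere, equation (9)."""
    dc_l = comoving_distance(z_lens)
    dc_s = comoving_distance(z_source)
    d_l = dc_l / (1.0 + z_lens)               # observer to lens
    d_s = dc_s / (1.0 + z_source)             # observer to source
    d_ls = (dc_s - dc_l) / (1.0 + z_source)   # lens to source (flat universe)
    xi0_mpc = 4.0 * math.pi * (sigma_v / C_KMS) ** 2 * d_l * d_ls / d_s
    return xi0_mpc * 1.0e3

print("xi_0 = %.2f kpc" % einstein_radius_kpc(4.5, 5.0))
```

The result scales as σ_v squared, so the adopted velocity dispersion matters more than the exact cosmological parameters.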


5.2 Comparison with previous measurements 


Several groups have made measurements of Ω_HI at z > 4.5 using
DLA surveys (Péroux et al. 2003; Guimarães et al. 2009; S10).
These are cumulative results: Ω_HI measurements from each new
QSO sample are combined with older Ω_HI measurements which
used a different DLA survey. While combining results in this way
maximizes the statistical S/N of the final result, it results in a
heterogeneous sample of quasar spectra with different data quality and
different DLA identification methods. As shown in Sections 4 and
5.1, at z > 4.4 different identification methods can produce a
systematic uncertainty in Ω_HI which, although smaller than the
statistical uncertainties for our current DLA sample, may still be
considerable. Since these analyses did not use mock spectra to explore
systematic effects, it is difficult to estimate the true uncertainty in
Ω_HI when combining heterogeneous quasar samples with different
selection criteria. In contrast, our sample has homogeneous data
quality, QSO selection method and DLA identification procedure,
and we use mock spectra to test any systematic effects.

Footnote: We note that eight of the QSOs used by S10 are also included
in our sample, but the 155 remaining GGG QSOs are independent of previous
samples.

Figure 12 shows our new results together with previous
measurements of Ω_HI, converted to our adopted cosmology. When
multiple measurements of Ω_HI have been made using overlapping QSO
samples and the most recent measurement uses a superset of
previous QSO samples, only the most recent measurement is shown.
For example, the results of S10 include most of the quasars used
by Péroux et al. (2003) and Guimarães et al. (2009), so we show
only the S10 result. In all such cases the most recent
measurement is consistent with earlier results. Where previous DLA
surveys have quoted Ω_g^DLA, we convert to Ω_HI using the relationship
Ω_HI = 1.2 Ω_g^DLA / 1.3. Our measurement at ⟨z⟩ = 4 is higher than,
but consistent with, earlier measurements by S10. As such, and
because we find a possible systematic increase in Ω_HI towards bright
QSOs, we checked whether the magnitude distribution of the S10
QSOs was lower than that of the GGG sample. z-band data were not
available for the whole S10 sample, but the eight QSOs which overlap
between their sample and ours have a similar fraction of QSOs with
z ≤ 19.2 and z > 19.2 mag. Therefore a difference in QSO
magnitudes is unlikely to cause a difference between our result and the
S10 result, and it seems more likely that the difference is caused by
a statistical fluctuation.

Our results at ⟨z⟩ = 4.9 give the most robust indication to
date that there is no strong evolution in Ω_HI over the ~1 Gyr
period from z = 5 to z = 3. We see a slight drop in Ω_HI between
our z ~ 4 and z ~ 4.9 Ω_HI measurements, but this difference
is not statistically significant. If the metal content of DLAs does
change suddenly at z = 4.7, as suggested by Rafelski et al. (2014),
there is no evidence it is accompanied by a concomitant change in
Ω_HI. However, the uncertainties remain large and future
observations should continue to test this possibility.

Figure 12 also shows a power law of the form Ω_HI = A(1 + z)^γ
fitted to the binned data. This simple function provides a
reasonable fit (χ² per degree of freedom = 1.44) across the full
redshift range, with best-fitting parameters A = (4.00 ± 0.24) × 10^-4
and γ = 0.60 ± 0.05. There is no obvious physical motivation for
this relation, nor any expectation that it should apply at redshifts
> 5. Nevertheless, it may provide a useful fiducial model to
compare to simulations and future observations.
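
The fiducial power law is simple to evaluate. The sketch below uses the best-fitting parameters quoted above and compares the model with the two GGG measurements from Table 1; taking z = 4.0 and z = 4.9 as representative redshifts for the two bins follows the values quoted in the text and is otherwise an assumption of this illustration.

```python
def omega_hi_power_law(z, amplitude=4.00e-4, gamma=0.60):
    """Fiducial power law Omega_HI = A (1 + z)^gamma."""
    return amplitude * (1.0 + z) ** gamma

# (representative z, measured Omega_HI, lower 1-sigma, upper 1-sigma),
# with the measurements and intervals taken from Table 1.
ggg_bins = [
    (4.0, 1.18e-3, 0.92e-3, 1.44e-3),
    (4.9, 0.98e-3, 0.80e-3, 1.18e-3),
]

for z, measured, lower, upper in ggg_bins:
    model = omega_hi_power_law(z)
    inside = lower <= model <= upper
    print("z = %.1f: model %.2e, measured %.2e, within 1-sigma: %s"
          % (z, model, measured, inside))
```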

We also compare our new high-redshift value to lower redshift
Ω_HI measurements. As previous authors have noted (e.g. Prochaska
et al. 2005; Prochaska & Wolfe 2009; Noterdaeme et al. 2009), Ω_HI
evolves from z = 3 to z = 0 by a factor of < 2, at odds with the very
strong evolution in the star formation rate over the same period.
Moreover, the drop in Ω_HI is much smaller than the increase in
stellar mass over this period. Figure 13 demonstrates this point by
showing the increase in comoving mass density in stars from z = 5,
ρ_* − ρ_*(z = 5), and the contemporaneous decrease in HI comoving
gas mass density, ρ_gas^HI(z = 5) − ρ_gas^HI, using the power law fit
from Figure 12. The mass in stars is calculated using the expression from
Madau & Dickinson (2014), and the range shows an uncertainty
of 50%, indicative of the scatter in observations around this curve.
While the evolution of Ω_HI from z = 5 to z = 3 remains uncertain,
the HI phase at z = 5 contains ample mass density to form all
the stars observed at z ~ 3, and the evolution predicted by the
simple power law function is consistent with this scenario. From
z ~ 3 to z ~ 0, however, there is a factor of 5-6 shortfall in HI
mass density compared to the amount needed to produce stars over the
same period. This underscores that at z < 3, the HI phase must
be continually replenished by more highly ionized gas, presumably
through a combination of cold-mode accretion (e.g. Dekel et al.
2009) and recycled winds (e.g. Oppenheimer et al. 2010). The more
highly ionized Lyman limit systems and sub-DLAs should then be
important tracers of the interface between this HI phase and more
highly ionized gas (e.g. Fumagalli et al. 2011).
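
The decrease in HI gas mass density since z = 5 can be estimated from the power law fit, using ρ_gas^HI = μ ρ_crit,0 Ω_HI with μ = 1.3. The sketch below assumes the standard value of the present-day critical density, 2.775 × 10^11 h² M_⊙ Mpc^-3, with h = 0.7; it does not reproduce the stellar mass curve, whose expression is not given in this section.

```python
RHO_CRIT_H2 = 2.775e11  # critical density today, in h^2 Msun Mpc^-3
H_LITTLE = 0.7          # H0 / (100 km/s/Mpc)
MU = 1.3                # accounts for the mass of helium

def omega_hi(z, amplitude=4.00e-4, gamma=0.60):
    """Power law fit Omega_HI = A (1 + z)^gamma."""
    return amplitude * (1.0 + z) ** gamma

def rho_gas_hi(z):
    """Comoving HI gas mass density (Msun Mpc^-3), including helium."""
    return MU * RHO_CRIT_H2 * H_LITTLE ** 2 * omega_hi(z)

def gas_decrease_since_z5(z):
    """Decrease in HI gas mass density between z = 5 and redshift z."""
    return rho_gas_hi(5.0) - rho_gas_hi(z)

for z in (3.0, 1.0, 0.0):
    print("z = %.0f: decrease = %.2e Msun Mpc^-3" % (z, gas_decrease_since_z5(z)))
```

Comparing these values with the growth in stellar mass density over the same interval gives the shortfall discussed in the text.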

Footnote: In Figure 13 the HI gas mass density ρ_gas^HI is used, which is
related to the HI mass density by ρ_gas^HI = μ ρ_HI with μ = 1.3 and
ρ_HI = ρ_crit,0 Ω_HI. We do not apply any correction for dust extinction
by foreground DLAs. If this is present, it could increase ρ_gas^HI by 20%
(Pontzen & Pettini 2009), which would not affect our discussion.

Figure 13. The increase in comoving stellar mass density from z = 5 to 0
(from Madau & Dickinson 2014; thin line and shading) and the
corresponding decrease in HI gas mass density over the same period (thick
line) using the fitting formula from Section 5.2. Before z ~ 3, the HI
gas phase contains ample mass density to fuel all the observed star
formation. However, from z ~ 3 to the present it contributes less than
~20% of the mass necessary to form stars, and so must be continually
replenished by more highly ionized gas.

There are several reasons to expect the neutral fraction of the
universe to evolve at z > 3. As we approach the epoch of
reionization, the filling factor of neutral hydrogen in the universe should
increase, as large pockets of the universe are no longer ionized. This
is reflected in the decrease in the mean free path for H-ionizing
photons (Fumagalli et al. 2013; Worseck et al. 2014) towards higher
redshifts. While the bulk of reionization is thought to occur at
z > 6, large neutral regions may persist to lower redshifts (e.g.
Becker et al. 2015). Our results suggest that while regions of this
kind may exist, they do not change the total neutral gas mass
density appreciably from that observed at z ~ 3. This is consistent
with the conclusions of Becker et al., who find that by z = 5 the
bulk of IGM absorption is due to density fluctuations instead of
large, neutral regions yet to be reionized.

Figure 12. Measurements of Ω_HI at different redshifts, from Zwaan et al.
(2005); Rao et al. (2006); Lah et al. (2007); Braun (2012); Martin et al.
(2010); Noterdaeme et al. (2012); Rhee et al. (2013); Delhaize et al.
(2013) and S10 (see Table 2). We do not show the measurement using SDSS
QSOs by Prochaska & Wolfe (2009); it is consistent with the measurement
by Noterdaeme et al. (2012), who use a superset of SDSS QSOs. We also do
not show the Péroux et al. (2003) and Guimarães et al. (2009) results,
which have a large overlap with the QSO sample used by S10 and are
consistent with that measurement. Finally, for clarity we do not show the
measurements at lower redshift from Freudling et al. (2011) and Meiring
et al. (2011); they are consistent with the plotted values. All
measurements have been converted to the same cosmology (h = 0.7,
Ω_m = 0.3, Ω_Λ = 0.7) and include HI mass only, with no contribution from
helium or molecular hydrogen.

This is perhaps not surprising. The distribution of these neutral
pockets depends on the nature of reionization, which may progress
from low-density regions to high-density regions ('outside-in') or
the reverse ('inside-out'), or some combination of the two (e.g.
Finlator et al. 201[...]). However, favoured scenarios see the highest
density regions with Δ = ρ/⟨ρ⟩ ≫ 100 reionized first, as they
are populated by galaxies, believed to be the dominant source of
ionizing photons. In this case neutral pockets will persist only in
underdense regions such as filaments or voids, with Δ < 100. At
z ~ 2.5 clustering measurements suggest most DLAs are found
inside haloes with masses 10^[...]-10^[...] M_⊙ (Cooke et al. 2006;
Font-Ribera et al. 2012), which have a mean Δ > 100. Therefore even
if large neutral regions do persist to z = 5, they may not occur at
cosmic densities high enough to produce strong DLA absorption.
The remnants of such regions may be observable as Lyman limit
systems however, and so one might expect an increase in their
incidence rate towards z ~ 5, which observations already hint may
be the case (Prochaska et al. 2010; Fumagalli et al. 2013). The
GGG sample can also be used to measure the LLS incidence rate at
z > 4, which we will present in a future work.


5.3 Comparison with theory 

In Figure 14 we show Ω_HI in comparison to some recent theoretical
predictions for its evolution. These are by Lagos et al. (2014) using
the semianalytic GALFORM model, by Bird et al. (2014) from a
simulation using the moving-mesh code AREPO, and by Tescari et
al. (2009; see also Duffy et al. 201[...]) and Davé et al. (2013) using
smoothed particle hydrodynamics (SPH) simulations. While these
models broadly match the slow evolution of Ω_HI since z ~ 4, most
struggle to reproduce the trend of decreasing Ω_HI with time (with
Tescari et al. being a notable exception). Lagos et al. suggest that
their model's underestimation of Ω_HI at high redshift may be due
to more neutral gas being found outside galaxy discs in the early
universe. If this interpretation is correct, then our observations
suggest that more than half the neutral gas mass (and more than half
of DLAs) is found outside galaxies at z ~ 5. Alternatively, Davé
et al. (2013) show that agreement between their simulations and
observations can be improved by assuming that a population of low
mass galaxies, unresolved by current SPH simulations, makes a
significant contribution to the DLA absorption cross-section at high
redshift.

It is evident that further improvements are needed to
theoretical models to reproduce the evolution of Ω_HI across the full
redshift range. If much of the neutral gas is found in galactic
outflows or recycled winds, the sub-grid prescription for outflows in
SPH simulations may have a strong influence on the predicted Ω_HI
(e.g. Bird et al. 2014). Furthermore, given the small sizes of DLAs
(~5 kpc; Cooke et al. 2010) it may also be important to correct
for any smoothing over small-scale density peaks where DLAs are
produced, and account for hydrodynamic instabilities which are not
resolved by current cosmological simulations (e.g. Crighton et al.
2015).

z            10^3 Ω_HI              Reference
0            0.375 ± 0.061          Zwaan et al. (2005)
0            0.548 ± 0.091          Braun (2012)
0.026        0.430 ± 0.030          Martin et al. (2010)
0.028        [...]                  Delhaize et al. (2013)
0.096        [...]                  Delhaize et al. (2013)
0.1          0.33 ± 0.05            Rhee et al. (2013)
0.2          0.34 ± 0.09            Rhee et al. (2013)
0.24         0.70 ± 0.31            Lah et al. (2007)
0.15-0.90    0.88 (+[...]/-[...])   Rao et al. (2006)
0.9-1.6      0.86 (+[...]/-[...])   Rao et al. (2006)
2.0-2.3      0.872 ± 0.044          Noterdaeme et al. (2012)
2.3-2.6      0.765 ± 0.035          Noterdaeme et al. (2012)
2.6-2.9      0.914 ± 0.044          Noterdaeme et al. (2012)
2.9-3.2      0.966 ± 0.070          Noterdaeme et al. (2012)
3.2-3.5      1.11 ± 0.11            Noterdaeme et al. (2012)
3.5-4.3      0.82 (+[...]/-[...])   Songaila (2010)
4.3-5.1      [...] (+0.30/-0.27)    Songaila (2010)

Table 2. Ω_HI measurements from the literature shown in Figure 12. Each
has been converted to a flat cosmology with H_0 = 70 km s^-1 Mpc^-1 and
Ω_m,0 = 0.3, and represents the mass density from HI gas alone, without
any contribution from helium or molecules. For previous analyses which
quote the gas mass in DLAs, Ω_g^DLA, we have converted to Ω_HI using
Ω_HI = δ_HI Ω_g^DLA / μ, where μ = 1.3 accounts for the mass of helium
and δ_HI = 1.2 estimates the contribution from systems below the DLA
threshold of 10^20.3 cm^-2 (see Section [...]).

Figure 14. Measurements of Ω_HI compared to recent theoretical
predictions. For clarity, the mean of measurements at z < 0.2 (the
errorbar shows the standard deviation) is shown. Lines show predictions
from a recent semi-analytic model (Lagos et al. 2014), along with SPH
(Tescari et al. 2009; Davé et al. 2013) and moving-mesh (Bird et al. 2014)
simulations. All the models have been converted to our adopted cosmology.
While they do reproduce the roughly flat evolution of Ω_HI from z = 5 to
0 (in comparison to the cosmic star formation rate density), they do not
match the data across the full redshift range.


6 SUMMARY 

We have measured Dhi at 3.5 < 2 < 5.3 using the Giant Gemini 
GMOS Survey, a homogeneous sample of 163 QSO spectra with 
emission redshifts > 4.4. All the QSOs were colour-selected from 
the SDSS survey and so have a well-understood selection function 
which is independent of any strong absorption in the QSO spec¬ 
tra. Using a combination of higher-resolution spectra of DLA can¬ 
didates and mock spectra, we explore systematic uncertainties in 
identifying DLAs due to strong IGM absorption at high redshift 
and the low spectral resolution of the GMOS spectra. The main 
conclusions from our analysis are: 

• We derive the most precise measurement of Ohi at (z) = 4.9 
to date, with a redshift path length at 2 > 4.5 a factor of eight larger 
than previous analyses. Ohi at 2 = 4.5 is consistent with the value 
measured at 2 = 3-3.5, and there is no evidence that Ohi evolves 
strongly over the Gyr period from redshift 5 to 3. There is also no 
evidence for an abrupt change in Ohi between 2 = 4 and 2 = 
5, which may be associated with a sudden change in metallicity 
reported at a similar redshift ( [Rafelski et al.pOlTt . However, such 
a change is not strictly ruled out by the data. 

• We quantify and correct for the fraction of spurious DLA can¬ 


didates, and for any DLAs missed in the low-resolution spectra, 
using higher resolution and mock spectra. We also estimate the un¬ 
certainty in the DLA column densities. For this DLA sample, the 
uncertainty introduced by these systematic effects on the Dhi mea¬ 
surement is smaller than the statistical uncertainties. 

• Using the higher resolution spectra and mocks we show that
the typical uncertainty on the DLA N_HI and redshift is 0.2 dex and
200 km s^-1, respectively. Despite the increased IGM absorption at
higher redshifts and the low spectral resolution, we find no strong
systematic offset in the estimated N_HI for DLAs either as a function
of redshift or N_HI.

• We find an excess in Ω_HI (30 ± 20 per cent) from the brighter
half of our QSO sample compared to the fainter half. This is consistent
with similar effects found in previous analyses at z ~ 2.5,
which posited gravitational lensing as a possible explanation. Given
the smaller Einstein radius at z = 4.5 compared to z = 2.5, for our
sample this effect seems more likely to be caused by a statistical
fluctuation. As such it should not significantly bias our result.

• Recent theoretical models do not match the data across their
full redshift range (z = 5 to 0). A simple power law model of the
form Ω_HI = A(1 + z)^γ with A = (4.00 ± 0.24) × 10^-4 and
γ = 0.60 ± 0.05, while not physically motivated, does describe the
observations over the entire redshift range.
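
As a rough illustration of the power-law description in the last point, the following sketch evaluates Ω_HI = A(1 + z)^γ at a few redshifts using the best-fitting central values quoted above. The function name `omega_hi` is hypothetical, and the quoted uncertainties on A and γ are ignored here, so the numbers are indicative only.

```python
# Power-law description of the neutral hydrogen mass density:
#   Omega_HI(z) = A * (1 + z)**gamma
A = 4.00e-4    # central value of the normalisation quoted in the text
GAMMA = 0.60   # central value of the power-law index quoted in the text


def omega_hi(z, a=A, gamma=GAMMA):
    """Return Omega_HI at redshift z for the simple power-law model."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return a * (1.0 + z) ** gamma


if __name__ == "__main__":
    # Print the model value at a few illustrative redshifts.
    for z in (0.0, 3.0, 4.9):
        print(f"z = {z:3.1f}  Omega_HI = {omega_hi(z):.2e}")
```

Because the model is not physically motivated, it is best treated as a convenient interpolation over z = 0 to 5 rather than something to extrapolate.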

QSO name                    z_em   Origin   S/N    GGG?
SDSS J000749.16+004119.6    4.78   ESI       9.7   no
SDSS J001115.23+144601.8    4.97   MAGE     31.4   yes
SDSS J005421.42-010921.6    5.02   ESI      16.7   no
SDSS J021043.16-001818.4    4.77   ESI       7.9   yes
SDSS J023137.65-072854.4    5.42   ESI      26.2   yes
SDSS J033119.66-074143.1    4.73   ESI      17.8   yes
SDSS J075618.10+410409.0    5.06   ESI      10.8   no
SDSS J075907.57+180054.7    4.82   ESI      18.3   yes
SDSS J081333.30+350811.0    4.92   ESI      16.5   no
SDSS J082454.02+130217.0    5.21   ESI      22.5   yes
SDSS J083122.60+404623.0    4.89   ESI      19.6   no
SDSS J083429.40+214025.0    4.50   ESI      21.0   no
SDSS J083920.53+352459.3    4.78   ESI      13.9   yes
SDSS J095707.67+061059.5    5.18   ESI      18.1   no
SDSS J100449.58+404553.9    4.87   ESI      11.4   no
SDSS J100416.12+434739.0    4.87   ESI      19.7   yes
SDSS J101336.30+424027.0    5.04   ESI      22.7   no
SDSS J102833.46+074618.9    5.15   ESI      12.4   no
SDSS J104242.40+310713.0    4.69   ESI      24.8   no
SDSS J105445.43+163337.4    5.15   ESI      21.5   yes
SDSS J110045.23+112239.1    4.73   ESI      23.4   yes
SDSS J110134.36+053133.8    5.04   ESI      22.3   yes
SDSS J113246.50+120901.6    5.18   ESI      32.7   yes
SDSS J114657.79+403708.6    5.00   ESI      25.7   yes
SDSS J120036.72+461850.2    4.74   ESI      19.0   yes
SDSS J120110.31+211758.5    4.58   ESI      31.8   yes
SDSS J120207.78+323538.8    5.30   ESI      25.6   yes
SDSS J120441.73-002149.6    5.09   ESI      15.6   yes
SDSS J122042.00+444218.0    4.66   ESI      11.3   no
SDSS J122146.42+444528.0    5.20   ESI      15.5   yes
SDSS J123333.47+062234.2    5.30   ESI      14.1   yes
SDSS J124515.46+382247.5    4.96   ESI      16.4   yes
SDSS J125353.35+104603.1    4.92   ESI      23.8   yes
SDSS J130215.71+550553.5    4.46   ESI      24.0   yes
SDSS J131234.08+230716.3    4.96   ESI      19.2   yes
SDSS J133412.56+122020.7    5.13   ESI      10.8   yes
SDSS J134040.24+281328.1    5.35   ESI      23.0   yes
SDSS J134015.03+392630.7    5.05   ESI      17.7   yes
SDSS J141209.96+062406.9    4.41   ESI      25.6   yes
SDSS J141839.99+314244.0    4.85   ESI      15.3   no
SDSS J142103.83+343332.0    4.96   ESI      24.1   no
SDSS J143751.82+232313.3    5.32   ESI      24.4   yes
SDSS J143835.95+431459.2    4.69   ESI      27.0   yes
SDSS J144352.94+060533.1    4.89   ESI       5.6   no
SDSS J144331.17+272436.7    4.42   ESI      24.4   yes
SDSS J151320.89+105807.3    4.62   ESI       8.3   yes
SDSS J152345.69+334759.3    5.33   ESI       8.6   no
SDSS J153459.75+132701.4    5.04   ESI       4.9   yes
SDSS J153627.09+143717.1    4.88   ESI       8.2   no
SDSS J160734.22+160417.4    4.79   ESI      17.9   yes
SDSS J161425.13+464028.9    5.31   ESI      13.1   yes
SDSS J162626.50+275132.4    5.26   ESI      33.9   yes
SDSS J162629.19+285857.5    5.04   ESI      12.0   yes
SDSS J165436.80+222733.0    4.68   ESI      33.6   no
SDSS J165902.12+270935.1    5.32   ESI      23.1   yes
SDSS J173744.87+582829.6    4.91   ESI      12.9   yes
SDSS J221644.00+001348.0    5.01   ESI       8.2   no
SDSS J225246.43+142525.8    4.88   ESI      14.8   yes
SDSS J231216.40+010051.4    5.07   ESI       4.8   no

Table 3. Higher resolution spectra used in our analysis. Columns list the
QSO name, R.A. and Dec. (J2000), emission redshift, the instrument used
to take the spectrum, the median S/N per pixel over rest-frame wavelengths
[...], and whether the QSO is in the GGG sample.

Name                        z_min   z_max   z_qso
SDSS J001115.23+144601.8    4.037   4.870   4.970
SDSS J004054.65-091526.8    4.046   4.880   4.980
SDSS J010619.24+004823.3    3.598   4.358   4.449
SDSS J012509.42-104300.8    3.639   4.406   4.498
SDSS J021043.16-001818.4    3.868   4.674   4.770
SDSS J023137.65-072854.4    4.417   5.313   5.420
SDSS J033119.66-074143.1    3.838   4.638   4.734
SDSS J033829.30+002156.2    4.096   4.939   5.040
SDSS J073103.12+445949.4    4.061   4.898   4.998
SDSS J075907.57+180054.7    3.911   4.723   4.820
SDSS J080023.01+305101.1    3.789   4.581   4.676
SDSS J080715.11+132805.1    3.961   4.782   4.880
SDSS J081806.87+071920.2    3.746   4.531   4.625
SDSS J082212.34+160436.9    3.649   4.418   4.510
SDSS J082454.02+130217.0    4.237   5.103   5.207

Table 5. The start and end redshifts for each QSO used to calculate the
[...] online.
z_JXP   log10 N_JXP   z_NC    log10 N_NC   z_hires   log10 N_hires   z_best   log10 N_best   Label        Hi-res exists?   GGG?
4.740   20.25         4.739   20.40        4.7395    20.30           4.7395   20.30          J0040-0915   y                y
4.187   20.50         -       -            -         -               4.1888   20.60          J0125-1043   n                y
4.887   20.75         4.886   20.70        4.8836    20.50           4.8836   20.50          J0231-0728   y                y
4.658   20.95         4.657   21.00        4.6576    20.75           4.6576   20.75          J0759+1800   y                y
4.098   21.05         4.096   21.05        -         -               4.0985   21.05          J0800+3051   n                y
-       -             4.472   20.40        4.4720    20.30           4.4720   20.30          J0824+1302   y                y
4.830   20.85         4.830   20.90        4.8305    20.75           4.8305   20.75          J0824+1302   y                y
4.341   20.85         4.343   20.90        4.3441    20.60           4.3441   20.60          J0831+4046   y                n
3.713   20.75         3.712   20.95        3.7100    20.75           3.7100   20.75          J0834+2140   y                n
4.391   21.20         4.391   21.30        4.3920    21.15           4.3920   21.15          J0834+2140   y                n
4.424   21.05         4.425   21.02        -         -               4.4227   21.05          J0854+2056   n                y
4.795   20.45         4.794   20.45        -         -               4.7945   20.45          J0913+5919   n                y
3.979   20.35         -       -            -         -               3.9790   20.35          J0941+5947   n                y
-       -             4.862   20.40        -         -               -        -              J0957+0519   y                n
4.473   20.40         4.472   20.55        -         -               -        -              J1004+4347   y                y
-       -             -       -            4.4596    20.75           4.4596   20.75          J1004+4347   y                y
4.798   20.55         4.805   20.50        4.7979    20.60           4.7979   20.60          J1013+4240   y                n
4.257   20.70         4.259   20.30        -         -               4.2580   20.50          J1023+6335   n                y
4.087   20.70         4.086   20.90        4.0861    20.75           4.0861   20.75          J1042+3107   y                n
-       -             -       -            4.8165    20.70           4.8165   20.70          J1054+1633   y                y
-       -             -       -            4.8233    20.50           4.8233   20.50          J1054+1633   y                y
4.429   20.85         -       -            -         -               -        -              J1100+1122   y                y
4.397   21.60         4.395   21.55        4.3954    21.65           4.3954   21.65          J1100+1122   y                y
4.346   21.40         4.347   21.35        4.3441    21.35           4.3441   21.35          J1101+0531   y                y
4.380   21.20         -       -            4.3801    21.15           4.3801   21.15          J1132+1209   y                y
5.015   20.75         5.015   20.60        5.0165    20.70           5.0165   20.70          J1132+1209   y                y
4.476   20.60         4.476   20.65        4.4767    20.45           4.4767   20.45          J1200+4618   y                y
3.799   21.35         3.807   21.20        3.7961    21.25           3.7961   21.25          J1201+2117   y                y
4.156   20.60         -       -            4.1579    20.50           4.1579   20.50          J1201+2117   y                y
4.793   20.75         4.798   20.75        4.7956    21.10           4.7956   21.10          J1202+3235   y                y
4.811   20.75         -       -            4.8106    20.75           4.8106   20.75          J1221+4445   y                y
4.926   20.35         4.931   20.70        4.9311    20.55           4.9311   20.55          J1221+4445   y                y
4.711   20.50         -       -            -         -               -        -              J1233+0622   y                y
4.448   20.80         4.447   20.70        4.4467    20.45           4.4467   20.45          J1245+3822   y                y
4.213   20.50         4.213   20.40        -         -               4.2130   20.45          J1301+2210   n                y
3.937   21.10         3.937   21.10        -         -               3.9387   21.10          J1309+1657   n                y
4.303   20.55         4.303   20.50        -         -               4.3027   20.52          J1332+4651   n                y
-       -             -       -            4.7636    20.35           4.7636   20.35          J1334+1220   y                y
4.348   20.55         4.348   20.50        -         -               4.3480   20.52          J1337+4155   n                y
5.003   20.85         -       -            -         -               -        -              J1340+2813   y                y
-       -             5.096   20.30        -         -               -        -              J1340+2813   y                y
4.826   21.05         4.827   21.05        4.8258    21.20           4.8258   21.20          J1340+3926   y                y
4.109   20.35         -       -            4.1093    20.35           4.1093   20.35          J1412+0624   y                y
-       -             4.322   20.40        -         -               -        -              J1418+3142   y                n
3.958   20.55         -       -            -         -               3.9628   21.00          J1418+3142   y                n
4.453   20.35         4.453   20.45        -         -               -        -              J1418+3142   y                n
4.114   20.60         4.112   20.70        -         -               4.1140   20.65          J1420+6155   n                y
-       -             4.665   20.35        4.6644    20.30           4.6644   20.30          J1421+3433   y                n
4.093   20.30         -       -            -         -               4.0929   20.30          J1427+3308   n                y
4.526   20.60         4.527   20.60        -         -               4.5218   20.60          J1436+2132   n                y
4.800   21.10         4.801   21.10        4.8007    21.20           4.8007   21.20          J1437+2323   y                y
4.400   20.80         4.398   20.85        4.3989    20.80           4.3989   20.80          J1438+4314   y                y
-       -             4.355   20.35        -         -               -        -              J1443+0605   y                n
4.223   20.95         4.222   21.05        4.2237    20.95           4.2237   20.95          J1443+2724   y                y
4.088   21.45         4.089   21.57        -         -               4.0885   21.51          J1511+0408   n                y
4.304   21.05         4.305   21.10        -         -               4.3043   21.08          J1524+1344   n                y
3.818   20.45         3.817   20.30        -         -               3.8175   20.38          J1532+2237   n                y
-       -             4.466   20.30        4.4740    20.40           4.4740   20.40          J1607+1604   y                y
4.915   20.90         4.912   21.00        4.9091    21.00           4.9091   21.00          J1614+4640   y                y
4.462   20.70         4.462   20.85        -         -               -        -              J1626+2751   y                y
4.312   21.20         4.313   21.30        4.3105    21.30           4.3105   21.30          J1626+2751   y                y
4.498   20.95         4.498   21.00        4.4973    21.05           4.4973   21.05          J1626+2751   y                y
4.605   20.55         -       -            4.6067    20.55           4.6067   20.55          J1626+2858   y                y
4.083   20.60         4.082   20.50        -         -               4.0825   20.55          J1634+2153   n                y
-       -             4.101   20.60        -         -               -        -              J1654+2227   y                n
4.001   20.60         4.003   20.75        4.0023    20.55           4.0023   20.55          J1654+2227   y                n
4.742   20.70         4.740   20.60        4.7424    20.80           4.7424   20.80          J1737+5828   y                y
-       -             -       -            4.7475    20.55           4.7475   20.55          J2252+1425   y                y
4.257   21.10         4.256   20.80        -         -               -        -              J2312+0100   y                n


Table 4. DLAs identified in the GGG spectra and other higher-resolution spectra. The first four columns list the redshift and N_HI estimated from the GMOS-
resolution spectra by two of us (JXP and NHMC). The fifth and sixth columns give measurements from a high-resolution spectrum of the QSO, if one exists.
The seventh and eighth columns give the ‘best’ estimate for the DLA redshift and N_HI. This is the value from the high-resolution spectrum if one exists,
otherwise it is the mean of the estimates from JXP and NHMC. In the first eight columns, a dash means no DLA was identified. The ninth column gives the
QSO name, and the tenth column lists whether the QSO has a high-resolution spectrum. The last column lists whether the QSO is part of the GGG sample used
to measure Ω_HI.
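
The rule for the ‘best’ estimate described in the Table 4 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `best_estimate` and the use of `None` for a dash are assumptions, as is the choice to fall back on a single low-resolution estimate when only one of JXP and NHMC identified the DLA (the caption does not say how that case was handled).

```python
from statistics import mean
from typing import Optional, Tuple

# (redshift, log10 N_HI); None stands for a dash in the table (assumption).
Estimate = Optional[Tuple[float, float]]


def best_estimate(jxp: Estimate, nc: Estimate,
                  hires: Estimate, hires_exists: bool) -> Estimate:
    """Return the 'best' (z, log10 N_HI) following the Table 4 caption."""
    if hires_exists:
        # The high-resolution measurement takes precedence; None means the
        # candidate was not confirmed in the high-resolution spectrum.
        return hires
    # No high-resolution spectrum: average the available low-resolution
    # estimates (using a lone estimate as-is is an assumption).
    low_res = [est for est in (jxp, nc) if est is not None]
    if not low_res:
        return None
    z_best = mean(est[0] for est in low_res)
    logn_best = mean(est[1] for est in low_res)
    return (z_best, logn_best)


if __name__ == "__main__":
    # Low-resolution estimates for the J1511+0408 candidate (no hi-res spectrum).
    print(best_estimate((4.088, 21.45), (4.089, 21.57), None, False))
```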
APPENDIX A: MOCK SPECTRA

We generated a set of mock spectra to quantify the reliability and completeness
of our DLA candidates in the low-resolution GMOS spectra. Here we
describe how these mocks were produced.

One mock spectrum was generated for each real GMOS spectrum, assuming
the same noise properties and the same QSO redshift. Therefore
the sample of mocks has the same redshift and S/N distribution as the real
GGG spectra. We model the forest absorption by a distribution of Voigt
profiles. Due to the difficulty of profile-fitting the strongly absorbed Lyα
forest at high redshifts, the N_HI, b and z distribution of Lyα forest lines at
z > 4 is not well known. However, the distribution at z ~ 2.5 has been
measured (e.g. Kim et al. 2013; Rudie et al. 2013). Therefore we assume
the shape of f(N_HI) at z ~ 4-5 is the same as that used by Prochaska
et al. (2014) at z ~ 2.5, and increase its normalisation until the mean flux
of the mock spectra at z = 4.5 matches the value from Becker et al. (2013).
DLAs were generated using f(N_HI) from O'Meara et al. (2013), and
we assume f(N_HI) is redshift-independent, whereas f(N_HI, X) evolves
as (1 + z)^1.5.

We initially did not include any line clustering in the Lyα forest, but
found that this produced spectra which were markedly different from the real
spectra: there were too few regions with very strong absorption and also
too few regions with low absorption. To address this we introduced line
clustering, similar to that used by Liske et al. (2008) to model the Lyα
forest at z ~ 3. This involves generating absorption at ‘clump’ positions
rather than individual lines. For each clump, 0, 1 or more lines are produced,
with the number taken from a Borel distribution (Saslaw 1989) with β =
0.6. Each line in a clump is offset from the clump redshift by a velocity
drawn from a Gaussian distribution with σ = 250 km s^-1. These values of
β and σ were chosen by a parameter grid search, varying each until values
were found which produce mock spectra with a Lyα forest that matches
the flux distribution of the real GMOS spectra. The number of clumps was
set such that the mean transmission in the Lyα forest matches the effective
optical depth at z = 4.5 derived by Becker et al. (2013).
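
The clump-based line generation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used for the mocks: the function names are hypothetical, and the Borel variate is drawn as the total progeny of a branching process with Poisson(β) offspring, which in its standard form always returns at least one line. How clumps with zero lines arise in the scheme described in the text is not specified, so that case is not modelled here. Velocity offsets are converted to redshift with the non-relativistic relation (1 + z_line) = (1 + z_clump)(1 + v/c).

```python
import math
import random

C_KMS = 299792.458  # speed of light in km/s


def sample_poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (fine for small means)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_borel(beta, rng):
    """Draw from a Borel distribution with parameter 0 <= beta < 1.

    Uses the fact that the total progeny of a branching process that starts
    from one individual with Poisson(beta) offspring is Borel distributed.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    total, pending = 1, 1
    while pending > 0:
        children = sample_poisson(beta, rng)
        total += children
        pending += children - 1
    return total


def clump_line_redshifts(z_clump, beta=0.6, sigma_kms=250.0, rng=None):
    """Return redshifts of the lines generated for one clump."""
    rng = rng or random.Random()
    n_lines = sample_borel(beta, rng)
    return [(1.0 + z_clump) * (1.0 + rng.gauss(0.0, sigma_kms) / C_KMS) - 1.0
            for _ in range(n_lines)]


if __name__ == "__main__":
    print(clump_line_redshifts(4.5, rng=random.Random(42)))
```

With β = 0.6 the mean number of lines per clump in this sketch is 1/(1 - β) = 2.5, so most clumps contain only a few lines while a minority contain many, which is what produces the extra regions of very strong and very weak absorption.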

We then generated a QSO continuum from the PCA components presented
by Suzuki et al. (2005), derived using a sample of low redshift QSOs
observed with the UV Faint Object Spectrograph. We set the QSO redshift
to that of the matching GMOS QSO, and added noise to the mock using
the same noise array as the GMOS spectrum, normalised so that the median
S/N of the mock and the real spectra in the range 7600-7800 Å match.
Using the noise array from the real spectra for the mocks is an approximation,
as the noise properties vary with the QSO spectrum (strong absorbers
and strong emission lines affect the noise level). However, the variations in
noise due to these effects are small in the Lyα forest, so we believe this is a
good approximation.
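
One way the noise normalisation described above could be implemented is sketched below. The function name `add_matched_noise`, the use of plain Python lists, and the definition of S/N as the noiseless flux divided by the 1σ error in each pixel are all assumptions made for illustration; the text does not give these details.

```python
import random
from statistics import median


def add_matched_noise(wave, mock_flux, real_flux, real_err,
                      lo=7600.0, hi=7800.0, rng=None):
    """Add Gaussian noise to a mock using a rescaled real noise array.

    The real error array is multiplied by a single factor chosen so that the
    median S/N of the mock between `lo` and `hi` (in Angstrom) equals the
    median S/N of the real spectrum in the same window.
    Returns (noisy_mock_flux, scaled_error_array).
    """
    rng = rng or random.Random()
    window = [i for i, w in enumerate(wave)
              if lo <= w <= hi and real_err[i] > 0]
    if not window:
        raise ValueError("no valid pixels in the normalisation window")
    target_snr = median(real_flux[i] / real_err[i] for i in window)
    mock_snr = median(mock_flux[i] / real_err[i] for i in window)
    if target_snr <= 0 or mock_snr <= 0:
        raise ValueError("median S/N must be positive in the window")
    # Dividing the mock S/N by this factor brings it to the target value.
    scale = mock_snr / target_snr
    err = [e * scale for e in real_err]
    noisy = [f + rng.gauss(0.0, e) for f, e in zip(mock_flux, err)]
    return noisy, err
```

Scaling the whole array by one factor keeps the wavelength dependence of the real noise, which is the approximation discussed in the paragraph above.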

Figure A1 shows three example mock spectra and their corresponding
real spectra, selected at random from our sample. The Lyα forest distribution
in the mocks closely matches the distribution seen in the real spectra.
We do not expect these mocks to correctly reproduce the mean optical depth
at the Lyman limit or the power spectrum of Lyα flux absorption. However,
our aim is not to reproduce all properties of the real spectra. Instead we aim
to create mock spectra which match by eye the Lyα forest at GMOS resolution,
the most important characteristic for DLA identification.

We did not include metal absorption in the mocks. The similarity between
the mocks and the real spectra, and the agreement between the correction
factors k_real and k_found derived from the mocks and high-resolution
spectra, suggest their inclusion is unnecessary.

Figure A1. Three mock GMOS spectra, selected at random, with their corresponding
real spectra, plotted against rest-frame wavelength (1000-1250 Å).
The real and mock spectra are normalised in the
rest frame wavelength region 940-1200 Å and offset for clarity. The flux
distribution in the Lyα forest (between the two dotted vertical lines), where
we search for DLAs, is very similar. The thin green lines show the 1σ error
array.

A1 High N_HI DLAs

DLAs in the column density range N_HI = [...] cm^-2 make the
dominant contribution to Ω_HI, and it is thus important to correctly measure
the uncertainty in k_real and k_found for this N_HI range. There are only
~10 DLAs in this column density range in both the mocks and the high-resolution
sample, so the uncertainties in this correction are large. Therefore
we generated further mocks with an enhanced incidence rate of high N_HI
systems. We did this by generating 10 times more mocks than were used
above, using the same line distribution. Due to time constraints, we were
unable to search by eye every one of these mocks. Instead we selected just
100 spectra: the 50 containing the highest N_HI DLAs, and a further 50 selected
at random from the remainder. This formed a sample of 100 new
mock spectra which we searched for high N_HI systems. The 50 random spectra
were included without requiring a DLA to be present so that, when scanning the
spectra by eye, the searcher would not be certain that every spectrum contains a DLA.
The k_found and k_real values found by including these extra sightlines in our
mock sample are shown in Figures A2 and A3. These show that the probability
of a spurious DLA at N_HI > 10^[...] cm^-2 is just 1%-5%, using
binomial statistics with [...]

The log N_HI and velocity differences between the candidate and true
values are shown in Figure A4. This shows that even at high N_HI there is
no strong systematic offset from the true value.
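
A range such as the 1%-5% spurious-candidate probability quoted in this section can be obtained from a binomial confidence interval on a fraction. The sketch below uses the Wilson score interval, which is one common choice; the text does not say which binomial estimator was actually used, and the counts in the example are hypothetical.

```python
import math


def wilson_interval(k, n, z=1.0):
    """Wilson score interval for a binomial fraction k/n.

    `z` is the number of Gaussian standard deviations (z=1 is roughly a
    68 per cent interval). Returns (lower, upper) bounds on the fraction.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Hypothetical counts: 1 spurious candidate out of 50 high-N_HI candidates.
    low, high = wilson_interval(1, 50)
    print(f"spurious fraction between {low:.3f} and {high:.3f}")
```

Unlike the simple sqrt(p(1 - p)/n) error, this interval stays within [0, 1] and remains usable when the number of spurious candidates is zero or very small.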
Figure A2. The fraction of true DLAs that were correctly identified by one
of the authors (NHMC), k_found, as a function of the true redshift and N_HI,
found using the mock spectra. This includes the mock sightlines with additional
strong N_HI DLAs.

Figure A3. The fraction of non-spurious DLA candidates identified by one of
the authors (NHMC), k_real, as a function of the candidate redshift and N_HI for
the mock spectra. This includes the mock sightlines with additional strong
N_HI DLAs.

Figure A4. The difference between N_HI estimated for DLA candidates
in low resolution spectra (N_HI,cand) and N_HI,true measured from mock
linelists by one of the authors (NHMC), including the extra sightlines
with additional strong N_HI DLAs. This shows there is no strong systematic
offset in the estimated N_HI as a function of redshift, even for
N_HI [...] cm^-2 DLAs.